This document has been approved
for public release and sale; its
distribution is unlimited.


MASSACHUSETTS 
INSTITUTE  OF 
TECHNOLOGY 


MIT/LCS/TR-198 


MULTIPLE-PROCESSOR 
IMPLEMENTATIONS  OF 
MESSAGE-PASSING  SYSTEMS 

Robert  H.  Halstead,  Jr. 

This  research  was  supported  by  the  Advanced  Research 
Projects  Agency  of  the  Department  of  Defense  and  was 
monitored  by  the  Office  of  Naval  Research  under 
contract  number  N00014-75-C-0661 


545  TECHNOLOGY  SQUARE,  CAMBRIDGE,  MASSACHUSETTS  02139 






MULTIPLE-PROCESSOR IMPLEMENTATIONS
OF MESSAGE-PASSING SYSTEMS

by


Robert  Hunter  Halstead,  Jr. 


MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Laboratory  for  Computer  Science 


CAMBRIDGE 


MASSACHUSETTS  02139 




MULTIPLE-PROCESSOR IMPLEMENTATIONS OF MESSAGE-PASSING SYSTEMS


Robert  Hunter  Halstead,  Jr. 


Submitted to the Department of Electrical Engineering and Computer Science
on January 20, 1978 in partial fulfillment of the requirements for
the Degree of Master of Science


ABSTRACT 


The goal of this thesis is to develop a methodology for building networks of
small computers capable of the same tasks now performed by single larger computers.
Such networks promise to be both easier to scale and more economical in many instances.

The mu calculus, a simple syntactic formalism for representing message-passing
computations, is presented and augmented to serve as the semantic basis for programs
running on the network. The augmented version includes cells, tokens, and
semaphores, as well as primitives for side-effect-free computation. Tokens, a novel
construct, allow certain simple communication and synchronization tasks without
involving fully general side effects.

The network implementation presented supports object references, keeping
track of them by using a new concept, the reference tree. A reference tree is a
group of neighboring processors in the network that share knowledge of a common
object. Also discussed are mechanisms for handling side effects on objects and
strategy issues involved in allocating computations to processors.


Name  and  Title  of  Thesis  Supervisor: 

Stephen  A.  Ward 

Assistant  Professor  of  Electrical  Engineering  and  Computer  Science 


Key  Words  and  Phrases: 

Message-passing,  distributed  computing,  actor  semantics,  networks. 


ACKNOWLEDGMENTS


Primary credit for the accomplishments, though not blame for the faults, of this
thesis must go to my thesis supervisor, Steve Ward, who supplied crucial inspiration
and managed to keep the thesis on track while at the same time convincing me
that it was I who was keeping it on track.

Secondary credit must be shared by the whole DSSR group at MIT, for upgrading
and maintaining the UNIX timesharing system on which the thesis research was
performed, and on which the thesis document itself was prepared. John Pershing,
Tom Teixeira, Terry Hayes, and other individuals mentioned in this introduction all
contributed software that was useful to me. I am grateful to all who uncomplainingly
tolerated my simulation runs, many of which decimated system performance.

Important intellectual contributions arose out of many discussions with Jim Gula,
Chris Terman, Peter Jessel, and others. Jim Gula in particular probably deserves
credit for steering my research in this direction in the first place.

Clark  Baker  deserves  special  mention  for  his  careful  reading  of  Chapter  2, 
which  uncovered  several  errors  of  varying  severity. 

The  Department  of  Electrical  Engineering  and  Computer  Science  at  MIT  through 
their  teaching  support  and  the  United  States  Government  through  their  research 
support  have  been  responsible  for  keeping  a roof  over  my  head  thus  far  during  my 
career  as  graduate  student  and  thesis  writer. 

Finally, I must express my gratitude to my parents, not only for their support
over the years, but for the use of their quiet home, where much of the most
productive work on this thesis was actually performed.

This research was supported by the Advanced Research Projects Agency of
the Department of Defense and was monitored by the Office of Naval Research
under contract number N00014-75-C-0661.


TABLE OF CONTENTS


1: Introduction
1.1: Parallelism
1.2: Networks and Distributed Computing
1.3: Thesis Overview
1.3.1: Message-Passing Semantics
1.3.2: Implementations
1.3.3: Conclusion

2: Message-Passing Semantics
2.1: The Pure Mu Calculus
2.1.1: Formal Definition of the Mu Calculus
2.1.2: Discussion
2.2: Coding Applicative Constructs
2.2.1: A Normal-Order Translation
2.2.2: An Applicative-Order Translation
2.2.3: Other Reduction Orders
2.3: Tokens
2.4: Parallel Evaluation and Self-Reference
2.4.1: Parallel Evaluation of Applicative Expressions
2.4.2: Parallel Evaluation of Actors
2.4.3: Representing List Structures
2.4.4: Self-Reference
2.4.5: Recursion
2.4.6: Conclusions Regarding the Use of Tokens
2.5: Cells
2.5.1: Informal Description of Cells
2.5.2: A New Axiom Scheme for the Mu Calculus
2.5.3: Discussion
2.5.4: Congruence of States
2.5.5: Conclusions Regarding the State Model
2.6: General Synchronization Operators
2.6.1: Semaphores
2.6.2: Construction of an Arbiter
2.6.3: Conclusions Regarding Semaphores
2.7: Coding Imperative Constructs
2.8: Summary

3: Implementations
3.1: The Basic Approach
3.1.1: Semantic Structure
3.1.2: Physical Structure
3.2: Overview of System Operation
3.2.1: Objects and Object References
3.2.2: Dynamics of the System
3.3: Object Management
3.3.1: Reference Trees
3.3.2: Reference Tree Maintenance
3.3.2.1: Changes in Reference Tree Membership
3.3.2.2: Changes in Object Text Custody
3.3.3: Garbage Collection
3.3.4: Management of Mutable Objects
3.3.4.1: Management of Tokens
3.3.4.2: Management of Cells
3.3.4.3: Summary
3.3.5: Alternative Reference Tree Algorithms
3.3.5.1: Disconnecting Reference Trees
3.3.5.2: Reorganizing Reference Trees
3.3.6: Summary
3.4: Event Distribution Strategy
3.4.1: The QFUDQE Strategy
3.4.2: Improved Event Distribution Strategies
3.4.3: Storage Management Strategy
3.4.4: Conclusions Regarding Event Distribution
3.5: Conclusions Regarding Our Implementation

4: Conclusions and Directions for Future Work
4.1: Alternatives to Our Design
4.2: The Mu Calculus
4.3: Implementations

A: Correctness of the Membership Protocol
A.1: The Protocol-Testing Program
A.2: Testing the Membership Protocol

References

Chapter 1: Introduction


The goal of this thesis is to develop a methodology for building networks of
small computers capable of the same tasks now performed by single larger computers.
An important reason for preferring a properly designed network over a single
processor is the ease of scaling the network configuration up or down by simply
changing the number of processors, rather than undergoing the trauma of, say,
switching to a more powerful central processor. Moreover, at one extreme of this
scaling approach lies the possibility of constructing networks with capacity
exceeding that of any feasible single processor.

Another primary motivation for using networks instead of single processors is
economic. A collection of slower processors with smaller memories is likely to be
less expensive than a single processor with the same aggregate instruction execution
rate and the same total memory size. This economic argument is valid, however,
only if the collection of smaller processors can be used as effectively as the
large processor to solve the problems of interest. In practice, there is likely to be
some overhead incurred in distributing the computation. Fortunately, the economic
advantages of distribution are strong enough to outweigh a modest penalty in this
department.

1.1:  Parallelism 

Several broad categories of research are relevant to this thesis. One such
category is the study of parallel algorithms and hardware configurations. The inclusion
of parallelism in an algorithm is significant because it relaxes the constraints
on how that algorithm must be executed (specifically, it relaxes constraints on the
ordering of steps in the algorithm). This relaxation of constraints allows for new
possible implementations. In particular, it may make feasible the use of special
parallel hardware configurations. Some of these, like Illiac IV[3], are designed for a
fairly narrow range of applications (numerical array processing, in this case). Others,
such as proposed designs for data flow machines[27], aim to be more generally
usable.

The purpose of most of these machines is to speed the completion of a computation
by doing several parts of it in parallel. This reduction in total elapsed time
for a computation is also an important goal of this thesis, but there are differences
in approach. First, the method to be presented in this thesis is generally suited to
hardware of relatively traditional construction (e.g., microprocessors). Second,
much of the parallelism research cited above has focused on very high-performance
machines designed to compute faster than any single computer could, whereas our
approach aims also to be feasible at lower points on the spectrum. Third, special-purpose
parallel designs tend to restrict the kinds of computations the programmer
can specify if he wants to be able to use the full power of the system. While our
system (or, for that matter, even a single-processor system!) cannot be totally
immune from this sort of effect, an effort has been made not to exclude any particular
kind of programming tool (e.g., side effects) from consideration. Although some
computations may run faster than others, our system will do its best to apply the
maximum possible parallelism to all problems.


1.2: Networks and Distributed Computing


Another category of research related to this thesis deals with computer networks.
The prototypical large computer network is the ARPANET[21]; however, for
our purposes we shall be more interested in some of the newer small-computer networks,
such as Ethernets[22], ring networks[8], and shared buses[23,36].
Research involving these networks is aimed at increasing the reliability and flexibility
of computer systems, as well as simply facilitating communication between
existing computers. In most cases, unfortunately, useful results in this area have
been limited to hardware technology. No particularly successful software principles
have emerged for using these networks.

Such work as has been done on the software aspects of computer networks
mostly falls under the rubric of "distributed computing." Work of this type has
been conducted by several researchers, notably Farber[9,10,26,26] and Wulf[37].
Distributed computing inevitably entails some amount of parallelism, the difference
being that in what is usually thought of as distributed computing, the coupling
between parallel activities is looser. Often a distributed system is conceived of as
simultaneously performing several independent tasks, almost like a timesharing system,
whereas students of parallelism are more apt to concentrate on coordinated
tasks devoted to solving a single problem.

The "timesharing system" aspect of distributed computing raises new questions
that transcend those simply involving parallelism. A distributed system must face
issues of coherence, security, reliability, and capacity for growth. In many cases,
these issues arise in distributed systems in a form substantially different from their
form in centralized computer systems. The topics of security and reliability have
been largely ignored in this thesis; however, coherence and capacity for growth
are very important goals of the system design we present.


In spite of these common interests, the research described here is fundamentally
different from most distributed-systems research. The latter has been concerned
mainly with ways of interconnecting existing systems, or systems resembling
them, with minimum disruption. Such work is of great practical importance, but
the research in this thesis is intended to be more visionary. Thus there is no explicit
attempt to deal with existing systems. Instead, a new kind of system is
designed from the ground up (or perhaps from the sky down, as we shall see!).

Finally, the development of protocols for packet communication (e.g., on the
ARPANET) is another subject in the distributed-computing area with some relevance
to this thesis. This topic is not central to the thesis, but research on protocols will
be cited from time to time in support of the feasibility of the communication algorithms
to be presented.

1.3: Thesis Overview

Most distributed-computing research to date has either consisted of hardware
designers coming up with neat new gadgets, or software designers trying to figure
out what to do with them. The philosophy of our research is that the emphasis on
hardware technology in network design has been misplaced. A top-down approach
is needed: first determine the software capabilities and organizational principles
that are desirable for a network, then pick a suitable technology and build the network.

This thesis documents an attempt to design a network in just this manner. It
begins with a description of a simple language based on the semantics of message
passing, and proceeds to show a multiprocessor implementation of this language
and explore its properties.



Although we shall attempt to justify each choice, the course of research such
as this leads past many arbitrary decision points. No claim can be made that a
message-passing language is the only reasonable starting point. Similarly, many
features of the various implementations presented are arbitrary; no "optimal"
implementation has been developed. The message-passing language itself is simple
in the extreme. Although it contains all the essential elements to construct a practical
system, any language to be used in real life would obviously be considerably
richer in features. Work of this scope only scratches the surface of what is possible
in multiprocessor systems.

Finally, the reader should be alerted to the fact that the implementation
described in this thesis has only been tested in the most preliminary way. The
skeleton of the system design has been shown to function; that is all that can be
said with certainty. Much work remains to be done before it can be determined
whether the approach developed here can really serve as the basis for an efficient
and useful system.

1.3.1:  Message-Passing  Semantics 

Chapter 2 of the thesis addresses the first step in our top-down design process:
the choice of a semantic basis for the programming languages we might imagine
supporting on our system. Almost immediately, we must choose between two
approaches: fairly standard process-based semantics, and the newer message-passing
outlook.

In the process model, the state of a computation is thought of, in general, as
composed of the contents of state variables (program counter, environment pointer)
associated with each of the various processes that currently exist in the system.
In this model, communication and synchronization between parallel processes is frequently
a problem, and the emphasis on sequentiality within each process (even
though other processes may run in parallel) combines with the real or imagined
overhead of creating and destroying processes to discourage the expression of
parallelism possible on the microscopic ("statement" or "expression") level. Thus
the process model encourages or forces the programmer to make arbitrary decisions
about microscopic sequencing of computations. Once the decisions have
been made, it can be hard to rediscover the valid alternatives.

At first glance, the process model might seem well suited for use on a network
of processors: simply divide up the processes among the processors available, relying
on a scheduler in each processor to arrange for the sharing of that processor
among the processes assigned to it. The problem with this idea is that, using
existing formalisms, the programmer is not likely to create very many processes
which can all be active at the same time. This, in turn, is likely to lead to a succession
of local imbalances, with the bulk of the processing load descending in turn
on one small set of processes (and processors) after another. In fact, the current
state of the art in multiprocessor systems seems largely to consist of distributing
processes among processors by hand in such a fashion as to minimize imbalances.

In response to this situation, Chapter 2 presents a language and semantic
structure which encourage the expression of microscopic parallelism, which limit in
a natural way the amount of storage directly accessible at any particular point in
execution (so that a particular locus of control can be economically moved between
processors), and which help solve the problems of communication and synchronization
between co-operating activities. This semantic structure will be based on the
message-passing model of computation, exemplified by the actor systems of
Hewitt[14,16,10] and Alan Kay's SMALLTALK[10]. This approach to computing
appears to be fundamentally different from the "applicative" model, typified by the
lambda calculus of Alonzo Church and the programming language LISP, and the
"imperative" model, embodied in the Von Neumann machine and most traditional programming
languages. Although Hewitt's PLASMA, for instance, contains numerous
constructs of an applicative or imperative flavor, it differs from conventional
approaches in that the underlying semantics of all these constructs are specified
exclusively in terms of message passing.




In a message-passing system, an event is defined as the receipt of a message
by an actor. When an actor receives a message, its script is invoked, which may
in turn cause other events. This simple message-passing control structure can
replace procedures, iteration, sequencing, and so on. The entire state of a computation
at any point may be summed up by the set of events that remain to be processed,
the system event list. Of course, this is only true if we regard the processing
of an event as an atomic, indivisible operation; when an event is removed
from the event list, any events it might cause to be added to the event list will
appear in the same quantum of time. Whenever we look, we see the system "at
rest," and its entire future potential is a function only of the contents of the event
list.
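The event-list view of computation can be illustrated with a minimal sketch in Python. It is only an illustration under stated assumptions, not the implementation developed in this thesis: the names `Actor`, `run`, and `make_countdown` are hypothetical, a script is modeled as an ordinary function, and events are processed one at a time in first-in, first-out order (the text above imposes no particular order).

```python
from collections import deque

class Actor:
    """An actor, represented here only by its script (an assumption of this sketch)."""
    def __init__(self, script):
        # script(message) returns a list of (actor, message) pairs: the events it causes
        self.script = script

def run(initial_events):
    """Process events until the event list is empty; return the messages received."""
    event_list = deque(initial_events)   # the whole state of the computation
    log = []
    while event_list:
        actor, message = event_list.popleft()     # remove one event ...
        event_list.extend(actor.script(message))  # ... and add the events it causes
        log.append(message)
    return log

def make_countdown():
    """Build an actor that, on receiving n > 0, causes one event sending n - 1 to itself."""
    def script(n):
        return [(countdown, n - 1)] if n > 0 else []
    countdown = Actor(script)
    return countdown

print(run([(make_countdown(), 3)]))  # [3, 2, 1, 0]
```

Between iterations of the loop the system is "at rest": nothing about the computation is recorded anywhere except in `event_list`.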


The "mu calculus" of Ward and Halstead[35] is the starting point in our
development of a semantic foundation, and forms the basis for what we shall call
"pure" message-passing systems with no "side effects." In these systems, not
only have the traditional control structures of function application, iteration, etc.,
been dispensed with, even the notion of "process" is no longer required.

Pure message-passing systems are a useful introductory framework for studying
various aspects of message passing, but are too restrictive to serve as models
for many important situations. Consequently, Chapter 2 will be concerned with
extending the message-passing language so that it contains the semantic constructs
important for most programming tasks. This will be done by introducing two
new kinds of objects: cells and tokens. Cells provide the basic mechanism for
side effects.

Tokens, which resemble the tokens described by Henderson[13], can be used
for many of the same purposes as cells, but avoid some of the problems of implementing
cells on distributed systems. A token is like a "pipe" for communication
between events that are separated in time or space. Initially, the pipe is empty,
but if an object is fed into one end of the pipe, it will be communicated to any
objects that are present at the other end. Although they cannot be used to implement
cells, tokens can be used to create self-referential structures and solve some
simple synchronization and communication problems.
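The "pipe" picture of a token can be made concrete with a small Python sketch. It is a hedged illustration only: the class name `Token` and the methods `send` and `receive` are hypothetical, and two behaviors are assumptions of the sketch rather than statements from the text, namely that a token can be filled only once and that a receiver arriving after the object still sees it.

```python
class Token:
    """A one-shot 'pipe': empty at first, later holding exactly one object."""
    _EMPTY = object()  # private marker meaning "nothing has been sent yet"

    def __init__(self):
        self._value = Token._EMPTY
        self._waiting = []  # receivers already present at the far end

    def receive(self, callback):
        """Register a receiver; it is called once the object is available."""
        if self._value is Token._EMPTY:
            self._waiting.append(callback)
        else:
            callback(self._value)  # assumption: late receivers still see the object

    def send(self, obj):
        """Feed an object into the pipe and deliver it to every waiting receiver."""
        if self._value is not Token._EMPTY:
            raise ValueError("token already filled")  # assumption: single use
        self._value = obj
        waiting, self._waiting = self._waiting, []
        for callback in waiting:
            callback(obj)

seen = []
token = Token()
token.receive(seen.append)  # receiver present before the object arrives
token.send(42)
token.receive(seen.append)  # receiver arriving afterwards
```

Because the contents never change once set, a token of this kind cannot play the role of a cell, which is consistent with the remark above that tokens cannot be used to implement cells.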

Chapter 2 concludes with a look at primitives for mutual exclusion. Throughout
the chapter, examples show the relationship between this message-passing
language and more traditional applicative and imperative languages. The semantics
of the message-passing language itself are described in as abstract and formal a
manner as feasible, with relatively little reference to possible implementations. The
chapter is intended to be complete in itself, for the convenience of those readers
whose main interest is in understanding the message-passing language and its
extensions.


The purpose of developing the mu calculus is twofold. On the one hand, the mu
calculus is designed to serve as the semantic basis or "machine language" for a
distributed system. Actual programming would almost certainly be done in a highly
sugared version of the mu calculus or in some language with a completely different
appearance. Consequently, the mu calculus itself need not be a masterpiece of
human engineering; rather, it should be (and is) a simple but representative
embodiment of the various operations that a distributed system must be capable of.
The second, and related, purpose of introducing the mu calculus is as an attempt to
capture the essence of message-passing computation in a simple formalism. The
conclusion of Chapter 2 includes comments relating to the success of this
endeavor.

1.3.2: Implementations

Chapter 3 of the thesis describes an implementation of the message-passing
language on a network of processors. The physical structure chosen is one in
which each processor has a limited number of neighbors (e.g., four) with which it
can communicate directly. Communication between processors that are not immediate
neighbors must be handled by one or more intermediary processors. The system
supports object references[4] and uses an event list distributed among the
various processors to keep track of the state of pending computations. A system
standard external representation for data is assumed, though representations
inside processors may differ. Garbage collection and mutable objects are supported.
Some attention is paid to the problem of scheduling events for maximally
efficient operation.
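To make the neighbor-to-neighbor structure concrete, the following Python sketch relays a message through intermediary processors. The wrap-around grid topology, the breadth-first route search, and the names `neighbors` and `route` are all assumptions made for illustration; the text above says only that each processor has a limited number of direct neighbors (e.g., four).

```python
from collections import deque

def neighbors(node, width, height):
    """The four direct neighbors of a processor on a wrap-around grid (assumed topology)."""
    x, y = node
    return [((x + 1) % width, y), ((x - 1) % width, y),
            (x, (y + 1) % height), (x, (y - 1) % height)]

def route(src, dst, width, height):
    """Breadth-first search for a shortest chain of neighbor-to-neighbor hops."""
    previous = {src: None}   # for each visited processor, the one it was reached from
    frontier = deque([src])
    while frontier:
        node = frontier.popleft()
        if node == dst:
            break
        for nxt in neighbors(node, width, height):
            if nxt not in previous:
                previous[nxt] = node
                frontier.append(nxt)
    path = []
    node = dst
    while node is not None:  # walk back from the destination to the source
        path.append(node)
        node = previous[node]
    return path[::-1]

path = route((0, 0), (2, 1), 4, 4)
intermediaries = path[1:-1]  # processors that must relay the message
```

Every consecutive pair in `path` is a pair of direct neighbors, so the processors in `intermediaries` are exactly those that would have to forward the message on behalf of the two endpoints.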

The main contribution of this chapter is in the object management algorithms
discussed in Section 3.3, particularly the reference tree concept and the management
of mutable objects. The very important strategy issues treated in Section
3.4, the reader is once again advised, must still be regarded as open questions.
Satisfactory performance of the whole system must await better solutions to these
problems than those actually tried thus far.

Since it attempts to give a complete description of the implementation, the narrative
in Chapter 3 is occasionally encumbered by a level of detail greater than
some readers will want. Such readers may skim or skip Sections 3.3.4.1 and 3.4.1
with no loss of continuity.

Chapter 3 relies only on the most fundamental of the concepts set forth in
Chapter 2; readers primarily interested in implementations can consider themselves
qualified to move on to Chapter 3 as soon as they feel they have a very basic
understanding of actors, events, tokens, and cells. In particular, comprehension of
the translation rules or examples given in Chapter 2 is not required.

1.3.3: Conclusion

Chapter  4 of  this  thesis  presents  conclusions  and  suggestions  for  further 
research.  It  reviews  the  thesis's  attempt  at  a top-down  system  design  effort, 
discusses  briefly  some  alternatives  to  the  design  given  here,  and  points  out  some 
directions  for  continued  work  on  the  mu  calculus  and  distributed  system  design. 


Chapter 2: Message-Passing Semantics


The goal of this chapter is to develop a semantic structure able to make
effective use of a suitably organized network of processors. We begin with an
extremely simple message-passing language.

2.1:  The  Pure  Mu  Calculus 

The pure mu calculus is a basic formalism for representing message-passing
computations. It is "pure" in that it contains no mechanism for causing side
effects; thus all "objects" representable in the calculus are immutable. Such a
language, akin to the lambda calculus (often used as a simple model of applicative
languages), lends itself easily to formal treatment with fairly well-established techniques.
In fact, as we shall see, the mu calculus is like a restriction of the lambda
calculus in which certain kinds of expressions are not allowed. The price we pay
for this ease of formal treatment is that the mu calculus is too simple a language to
provide a satisfactory model for many interesting situations. Nevertheless, we
study the pure mu calculus before going on because it exposes the basic philosophy
and character of message-passing systems, as well as providing a framework
for subsequent embellishments. It should be added that the ease of formal
description of the pure mu calculus is accompanied by a great deal of flexibility in
how to implement a distributed system based on this calculus. In fact, one result
of this thesis is to document a certain correlation between formal tractability and
flexibility of implementation. This correlation appears to be much stronger on distributed
systems than in centralized systems. For example, side effects, which are
difficult to handle formally, cause no particular implementation problems on centralized
computer systems, but can cause quite a bit more difficulty on a network.
Thus the pure mu calculus, although not very useful by itself, forms a distinctly
useful subset of a fuller message-passing language, especially one designed for a
distributed system.

2.1.1: Formal Definition of the Mu Calculus

The introduction stated that the basic elements of a message-passing system
are events, actors, and scripts. While the distinction between an actor and a
script will be useful to us later, for now we will just represent an actor by writing
its script, and will use the concepts interchangeably. Other kinds of objects, such
as numbers, are also useful, so following [36] we make the following definitions:

Definition 2.1:

An object is

1) a member of a set C of distinguished constants (such as numbers),

2) a member of a set V of variables (for our purposes we shall represent
specific variables by upper- and lower-case Roman letters),

3) a member of a finite set P of primitives (primitive operations such as addition,
comparison, and so on), or

4) an actor of the form $(\mu x_1 x_2 \ldots x_n : E_1 E_2 \ldots E_m)$, where each of
the $x_i$ is a variable and each of the $E_j$ is an event. For any $n \ge 0$ and
$m \ge 0$ such an expression is a valid actor.

Definition 2.2:

An event is a sequence $(a_1 a_2 \ldots a_n)$ of objects, for any $n \ge 1$.



In many cases, parentheses in events or actors are redundant and may be supplied
from context; in such cases, they may be omitted.

The above definitions establish objects and events as two of the fundamental
constituents of the mu calculus. The third major ingredient, which makes the mu
calculus a dynamic system rather than just a static definition of objects and
events, is the causality relation, which relates pairs of events. It will be useful in
what follows to distinguish pairs of events and objects that differ in some
structural way from pairs that differ only in the names chosen for bound variables.

Definition 2.3:

Two events or objects $A$ and $B$ are congruent, written $A \sim B$, if $A$ can be
converted to $B$ simply by renaming bound variables in $A$. For this purpose we use the
same definition of "bound variable" as is used in the lambda calculus [6,6].

Note that congruence is an equivalence relation. For the remainder of this thesis,
we shall treat congruent events or objects as if they were the same. Thus, formally,
we are working with equivalence classes of events or objects, and representing
these classes by selected members.

We may now discuss the causality relation $\longrightarrow$ on events. The causality
relation is defined by two axioms which closely parallel the axioms of the lambda
calculus. For the first axiom, which corresponds to beta-reduction in the lambda
calculus, we must have a syntactic substitution function like the lambda-calculus
substitution rule $S[X; y; Z]$, which denotes, informally, the result of substituting the
expression $X$ for all free occurrences of the variable $y$ in the expression $Z$,
renaming bound variables in $Z$ if necessary to avoid identifier conflict. The mu-calculus
substitution function is identical to this in every respect except the replacement of
the letter $\lambda$ by the letter $\mu$, so interested readers are referred to [0] for the
details. In defining the mu calculus, it is frequently useful to be able to express
the result of substituting several expressions for several variables at the same
time. We write this as $S[X_1, X_2, \ldots, X_n; y_1, y_2, \ldots, y_n; Z]$. This is equivalent to

    $S[X_1; y_1; S[X_2; y_2; \cdots S[X_n; y_n; Z] \cdots]]$

provided that none of the $y_i$ occurs free in any of the $X_j$, which can always be
arranged by renaming the offending $y_i$ and similarly renaming all free occurrences of
$y_i$ in $Z$.

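We are now prepared to state the axioms that define causality:

Axiom A1 (mu-reduction):

If $E$ is the event $((\mu x_1 x_2 \cdots x_n . E_1 E_2 \cdots E_m)\, A_1 A_2 \cdots A_n)$,
then $E \longrightarrow E'$ for each event $E'$ of the form
$S[A_1, A_2, \ldots, A_n; x_1, x_2, \ldots, x_n; E_i]$, where $1 \le i \le m$.

Axiom A2 (primitives):

If $E$ is the event $(p\, c_1 c_2 \cdots c_n\, A)$ where $p$ is a primitive and each $c_j$ is a
constant, then $E \longrightarrow (A\ \mathbf{p}[c_1; c_2; \ldots; c_n])$, where $\mathbf{p}$ is the
function denoted by $p$.

It is often convenient to use the transitive closure of the relation $\longrightarrow$;
following usual mathematical practice, we denote this by $\longrightarrow^{+}$. Similarly, we
denote the reflexive transitive closure of $\longrightarrow$ by $\longrightarrow^{*}$.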




2.1.2: Discussion


The pure mu calculus has been shown to be a consistent and universal computing
scheme [36]. Let us examine in more detail the various elements of this
scheme.

Events are the basic mechanism by which things "happen" in the mu calculus.
An event $(A_1 A_2 \cdots A_n)$ denotes the arrival of a message containing the sequence of
objects $A_2 \cdots A_n$ at the receiver object $A_1$. Two kinds of objects are meaningful as
receivers: actors (with meaning formally specified by axiom A1) and primitives
(dealt with by axiom A2).

Objects represent the data of the mu calculus. Distinguished constants may
be used to model numbers and other fundamental data types whose semantics
need not be further specified within the axiom system of the mu calculus.
Primitives stand for the basic operations on these constants. Identifiers are
placeholders; when an actor which binds a particular occurrence of an identifier
receives a message, the occurrence of the identifier is replaced by the
corresponding object from the message. Finally, actors are the principal mechanism
for abstraction in the mu calculus. The body of an actor may contain events which
are prototypes for the computations that will ensue whenever the actor receives a
message.

Events can cause other events; a computation in the mu calculus generally
proceeds from some initial event, through various events caused by this event and
its descendants, to some desired final event or events caused (in the sense of
$\longrightarrow^{+}$) by the initial one. Axioms A1 and A2 specify the mechanisms by which new
events may be caused, and in so doing give meaning to actors and primitive
objects.

Axiom A1 deals with actor-events (events whose receivers are actors) and, as
has already been mentioned, bears a close correspondence to axiom beta of the
lambda calculus. The significance of axiom A1 is that if the actor
$(\mu x_1 x_2 \cdots x_n . E_1 E_2 \cdots E_m)$ receives a message $A_1 A_2 \cdots A_n$, then $m$ new
events can be caused, one corresponding to each of the $E_i$ in the body of the actor,
with the appropriate object $A_j$ substituted for each free occurrence in $E_i$ of each
bound variable $x_j$ of the actor. For example:

    $(\mu xc.\,{+}xxc)\,3\,R$ causes ${+}\,3\,3\,R$
    $(\mu c.\,({+}\,3\,4\,c)(c\,4))\,R$ causes ${+}\,3\,4\,R$ and $R\,4$
    $(\mu xc.)\,7\,R$ causes no event

Thus an actor with one event in its body will cause one new event when it
receives a message, while an actor with more than one event in its body will cause
several events. An actor with an empty body will terminate the particular line of
computation from which it was sent a message. In this way, the pure mu calculus
includes mechanisms for spawning and terminating concurrent activities. It
unfortunately does not, as we shall see, include any mechanism for communication
between concurrent activities.
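
The effect of axiom A1 on the three examples above can be sketched in Python. This is only an illustration under stated assumptions: the names Actor, substitute, and mu_reduce are invented here, events are plain tuples, variables and the uninterpreted names + and R are strings, and no renaming of bound variables is attempted, so the sketch is valid only when bound names do not clash with free names in the message.

```python
from typing import NamedTuple


class Actor(NamedTuple):
    """An actor (mu x1 ... xn . E1 ... Em): bound variables plus body events."""
    params: tuple   # variable names, each a str
    body: tuple     # events; each event is a tuple of objects


def substitute(obj, env):
    """Replace free variables of obj according to env (a dict name -> object)."""
    if isinstance(obj, Actor):
        # Variables rebound by an inner actor are shadowed, not replaced.
        inner = {v: o for v, o in env.items() if v not in obj.params}
        body = tuple(tuple(substitute(o, inner) for o in ev) for ev in obj.body)
        return Actor(obj.params, body)
    if isinstance(obj, str):          # a variable (or an uninterpreted name)
        return env.get(obj, obj)
    return obj                        # a constant


def mu_reduce(event):
    """Axiom A1: return the list of events caused by an actor-event."""
    receiver, message = event[0], event[1:]
    if not isinstance(receiver, Actor) or len(message) != len(receiver.params):
        raise ValueError("not an actor-event of matching length")
    env = dict(zip(receiver.params, message))
    return [tuple(substitute(o, env) for o in ev) for ev in receiver.body]


# The three examples from the text.
double = Actor(("x", "c"), (("+", "x", "x", "c"),))
fork = Actor(("c",), (("+", 3, 4, "c"), ("c", 4)))
sink = Actor(("x", "c"), ())

print(mu_reduce((double, 3, "R")))   # [('+', 3, 3, 'R')]
print(mu_reduce((fork, "R")))        # [('+', 3, 4, 'R'), ('R', 4)]
print(mu_reduce((sink, 7, "R")))     # []
```

Returning a list of caused events, rather than running them, keeps the sketch neutral about the order in which the spawned events are processed, which the calculus itself does not fix.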

Axiom A2 provides for certain primitive functions which operate on the
distinguished constants of the computing scheme. The exact nature of these
functions is not important, and we will invent new ones wherever convenient. The only
restriction is that a primitive function must have only sequences of constants in its
domain. Examples of the kinds of functions that are useful are

    $\#\,A\,C$, which causes $(C\ A{+}1)$,

    ${+}\,A\,B\,C$, which causes $(C\ A{+}B)$, and

    ${>}\,A\,B\,C$, which causes $(C\ \mu abc.\,ca)$ if $A > B$, and causes $(C\ \mu abc.\,cb)$ if $A \le B$.

Thus

    $\#\,8\,R$ causes $R\,9$

    ${+}\,3\,4\,R$ causes $R\,7$

    ${>}\,6\,7\,R$ causes $R\,(\mu abc.\,cb)$

    ${>}\,6\,2\,(\mu x.\,x\,1\,0\,R)$ causes $(\mu x.\,x\,1\,0\,R)(\mu abc.\,ca)$
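
A minimal sketch of axiom A2 in Python follows. It assumes that integers stand for the distinguished constants, that an event is a tuple whose last element is the receiver of the result, and that the two actors returned by the comparison are represented simply by the strings TRUE and FALSE; the names PRIMITIVES and apply_primitive are hypothetical.

```python
TRUE = "mu abc.ca"    # stands for the actor that forwards its first argument
FALSE = "mu abc.cb"   # stands for the actor that forwards its second argument

# Each primitive symbol denotes an ordinary function on constants.
PRIMITIVES = {
    "#": lambda a: a + 1,
    "+": lambda a, b: a + b,
    ">": lambda a, b: TRUE if a > b else FALSE,
}


def apply_primitive(event):
    """Axiom A2: (p c1 ... cn A) causes (A p[c1; ...; cn])."""
    name, *constants, receiver = event
    if name not in PRIMITIVES:
        raise ValueError(f"unknown primitive {name!r}")
    if not all(isinstance(c, int) for c in constants):
        raise ValueError("primitives accept only constants")
    return (receiver, PRIMITIVES[name](*constants))


print(apply_primitive(("#", 8, "R")))      # ('R', 9)
print(apply_primitive(("+", 3, 4, "R")))   # ('R', 7)
print(apply_primitive((">", 6, 7, "R")))   # ('R', 'mu abc.cb')
```

The check that every argument is a constant mirrors the restriction that a primitive function has only sequences of constants in its domain.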

2.2:  Coding  Applicative  Constructs 

Since message-passing is a relatively novel and unconventional approach to
computing, it is appropriate to describe its relationship to more familiar approaches.
Some literature on this topic has been written by Hewitt [16], but the work
described here is even more radical than his, since he still allows some applicative
constructs to be present in his message-passing language. This work is also
carried out at a more fundamental level than Hewitt's, simplifying many of the
analogies. Consequently, several sections in this chapter will consider translations
between mu-calculus constructs and those of other programming methodologies. In
this section, we focus on applicative languages, such as the lambda calculus or
Pure LISP.




2.2.1: A Normal-Order Translation

A rule for translating applicative expressions into objects is stated and proved
in [36]; we give it, without proof, below. Central to the translation is the concept
of a continuation, as described by Strachey and Wadsworth [32] or Hewitt [15]. In
an applicative language, the use to which the value of an expression will be put is
determined by the context in which that expression appears. In the mu calculus,
there is no such thing as an "expression." Values are computed and used by means
of events causing other events. In the mu calculus, there is also no "context" (in
the applicative sense) for a computation. Thus a value that is computed by, say,
adding two numbers, must be disposed of in a manner specified by another part of
the same event which caused the addition. The prevailing ethic for doing this kind
of computation in a message-passing language is to use a continuation, an actor
which will receive the value produced by a computation and which will then
continue, using that computed value where appropriate.
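
The idea of a continuation can be illustrated in Python, with ordinary functions standing in for actors. This is a sketch only; the names add and double_then are invented for the example, and a Python call stands in for the causing of an event.

```python
def add(a, b, continuation):
    """Compute a + b and hand the result to the continuation; return nothing."""
    continuation(a + b)


def double_then(x, continuation):
    """Compute x + x, again delivering the result rather than returning it."""
    add(x, x, continuation)


results = []
# Compute (3 + 3) + 4. The "rest of the computation" after the first
# addition is passed explicitly as a function of the intermediate value.
double_then(3, lambda six: add(six, 4, results.append))
print(results)   # [10]
```

No function here returns a value to its caller: every result is disposed of by whatever continuation accompanied the request, which is the discipline the translation rules below rely on.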

The translation rule we present takes an applicative expression $A$ and
produces an object $O[A]$ with the property that if $O[A]$ is sent some continuation $C$, $C$
will at some point be sent the value of the original applicative expression $A$
(appropriately translated to the mu-calculus domain). Thus

    $O[3]\ C \longrightarrow^{+} C\,3$
    $O[{+}\,2\,5]\ C \longrightarrow^{+} C\,7$
    $O[\lambda x.x]\ C \longrightarrow^{+} C(\mu xc.\,xc)$




Definition 2.4:

Given any applicative expression $A$, the object $O[A]$ is

1. If $A$ is a constant $n$, then $\mu c.\,cn$.

2. If $A$ is an identifier $x$, then $x$.

3. If $A$ is a $\lambda$-expression $\lambda x_1 \cdots x_n.M$, then
$\mu c.\,c(\mu x_1 \cdots x_n c.\,(O[M]\ c))$, where $c$ is an identifier that does not
occur free in $M$ and is not on the list $x_1 \cdots x_n$.

4. If $A$ is a combination $P Q_1 \cdots Q_n$, then
$\mu c.\,(O[P]\ \mu y.\,(y\ O[Q_1] \cdots O[Q_n]\ c))$, where $c$ is an identifier that does
not occur free in $P Q_1 \cdots Q_n$.

5. If $A$ is a unary primitive (such as $\#$), then
$\mu c.\,c(\mu ad.\,a(\mu x.\,px(\mu z.\,(\mu e.\,ez)d)))$, where $p$ is the mu-calculus
primitive corresponding to $A$.

6. If $A$ is a binary primitive (such as $+$ or $>$), then
$\mu c.\,c(\mu abd.\,a(\mu x.\,b(\mu y.\,pxy(\mu z.\,(\mu e.\,ez)d))))$, where $p$ is the
mu-calculus primitive corresponding to $A$.

Translations for other primitive operators are similar to those specified in clauses 5
and 6.
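
As a worked illustration of definition 2.4, the derivation below traces the translation of the applicative expression # 8, assuming that # is the successor primitive and abbreviating by F the actor that clause 5 produces for it. Each step is one application of axiom A1, except the step marked as axiom A2; the bound variables named c in different clauses are distinct.

```latex
\begin{align*}
O[\#\,8] &= \mu c.\,(O[\#]\ \mu y.\,(y\ O[8]\ c)), \qquad O[8] = \mu c.\,c\,8,\\
O[\#] &= \mu c.\,c\,F, \qquad F = \mu a d.\,a(\mu x.\,\#\,x\,(\mu z.\,(\mu e.\,e\,z)\,d)),\\[4pt]
(O[\#\,8]\ C)
 &\longrightarrow (O[\#]\ \mu y.\,(y\ O[8]\ C))\\
 &\longrightarrow ((\mu y.\,(y\ O[8]\ C))\ F)\\
 &\longrightarrow (F\ O[8]\ C)\\
 &\longrightarrow (O[8]\ \mu x.\,\#\,x\,(\mu z.\,(\mu e.\,e\,z)\,C))\\
 &\longrightarrow ((\mu x.\,\#\,x\,(\mu z.\,(\mu e.\,e\,z)\,C))\ 8)\\
 &\longrightarrow (\#\ 8\ \mu z.\,(\mu e.\,e\,z)\,C)\\
 &\longrightarrow ((\mu z.\,(\mu e.\,e\,z)\,C)\ 9) && \text{(axiom A2)}\\
 &\longrightarrow ((\mu e.\,e\,9)\ C)\\
 &\longrightarrow (C\ 9)
\end{align*}
```

The operand 8 is not evaluated until the actor F sends it a continuation, which is the normal-order behavior the rule is meant to model.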

As shown in [35], the existence of this translation rule establishes the
universality and consistency of the mu calculus. Perhaps more importantly for the
purposes of this work, it gives a clue as to how various useful lambda-calculus
devices, such as the Y operator, may be adapted for use in the mu calculus.
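
The normal-order rule can also be sketched in Python by letting closures stand in for actors. This is an illustration under several assumptions, not the construction of [36]: expressions are encoded as integers, strings, ("lam", params, body) and ("app", operator, operands...) tuples; the names translate, primitive_function, and PRIMS are invented; fail is a hypothetical primitive that raises instead of yielding a value; and a Python call serializes what the calculus expresses as event causation.

```python
PRIMS = {
    "#": lambda x: x + 1,          # successor
    "+": lambda x, y: x + y,       # addition
    "fail": lambda x: 1 // 0,      # hypothetical primitive that never yields a value
}


def primitive_function(fn):
    """Clauses 5 and 6: evaluate each operand thunk, then apply fn."""
    def function(*args):
        *thunks, d = args
        def collect(values, remaining):
            if not remaining:
                return d(fn(*values))
            return remaining[0](lambda v: collect(values + [v], remaining[1:]))
        return collect([], thunks)
    return function


def translate(expr, env=None):
    """Return O[expr] as a callable that accepts one continuation."""
    env = env or {}
    if isinstance(expr, int):                        # clause 1: mu c. c n
        return lambda c: c(expr)
    if isinstance(expr, str):
        if expr in env:                              # clause 2: the identifier itself
            return env[expr]
        fn = PRIMS[expr]
        return lambda c: c(primitive_function(fn))
    tag, *rest = expr
    if tag == "lam":                                 # clause 3
        params, body = rest
        def function(*args):
            *actuals, c = args
            return translate(body, {**env, **dict(zip(params, actuals))})(c)
        return lambda c: c(function)
    if tag == "app":                                 # clause 4: operands stay unevaluated
        operator, *operands = rest
        thunks = [translate(q, env) for q in operands]
        return lambda c: translate(operator, env)(lambda y: y(*thunks, c))
    raise ValueError(f"not an applicative expression: {expr!r}")


out = []
translate(("app", "+", 2, 5))(out.append)
translate(("app", ("lam", ("x",), "x"), 3))(out.append)
# The operand (fail 0) is never evaluated because the function ignores x.
translate(("app", ("lam", ("x",), 3), ("app", "fail", 0)))(out.append)
print(out)   # [7, 3, 3]
```

The third example is the characteristic normal-order case: an operand with no value does no harm as long as the function body never sends it a continuation.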

The translation rule $O$ models in the mu calculus the semantics of normal-order
reduction on applicative expressions. An interesting attribute of the mu calculus is
that it can be used to model other reduction orders as well (see [6] for a
discussion of reduction orders). This is because the order in which events are caused in
the mu calculus is highly constrained by the requirement that axioms can only be
applied to entire events, not to subexpressions of events. Thus the next
computation is the one that is on the top level of an event; subsequent computations are
specified by subexpressions which are nested more deeply inside the event. This
degree of explicitness can be both an asset and a liability, as we shall see.

2.2.2: An Applicative-Order Translation

A reduction order that is used more commonly than normal-order reduction is
applicative-order reduction, in which the values of arguments are computed before
the function to be applied to them is invoked. The use of applicative-order
reduction usually simplifies an interpreter and increases its efficiency relative to
normal-order reduction; this accounts for its use in LISP and similar languages. It is
possible to describe translation rules from applicative expressions to message-passing
expressions which produce results corresponding to applicative-order reduction on
the applicative expression being translated.

One such possible rule is (for simplicity, we treat only single-argument
functions)




Definition 2.5:

Given any applicative expression $A$, the object $O'[A]$ is

1. If $A$ is a constant $n$, then $\mu c.\,cn$.

2. If $A$ is an identifier $x$, then $\mu c.\,cx$, where $c$ is some identifier other than $x$.

3. If $A$ is a $\lambda$-expression $\lambda x.M$, then $\mu c.\,c(\mu xc.\,(O'[M]\ c))$, where $c$ is
some identifier other than $x$ which does not occur free in $M$.

4. If $A$ is a combination $PQ$, then $\mu c.\,(O'[P]\ \mu y.\,(O'[Q]\ \mu z.\,yzc))$, where $c$ is
an identifier that does not occur free in $A$ and $y$ is an identifier that does not
occur free in $Q$.

5. If $A$ is a unary primitive (such as $\#$), then $\mu c.\,cp$, where $p$ is the mu-calculus
primitive corresponding to $A$.

Comparison of this definition with definition 2.4 reveals that the evaluation of
an argument expression $X$ has been transferred from the application of primitives
(clause 5) back to the evaluation of arbitrary combinations (clause 4).
("Evaluation" of $X$, in mu-calculus terms, means the causing of an event $(O[X]\ C)$ which will
eventually cause $C$ to be sent the value of $X$.) The change in evaluation times
requires a corresponding change in the semantics of identifiers in the translation $O'$.
In the processing of events generated by the translation rule $O$, identifiers are
replaced by unevaluated arguments, that is, objects of the form $O[X]$ for some $X$.
This is a consequence of the form of clauses 3 and 4 of definition 2.4. The
change in clause 4 of definition 2.5 causes identifiers to be bound instead to
values, that is, objects returned to the continuation $C$ by events of the form
$(O'[X]\ C)$. Thus if we mean to abide by our previously established convention that
$(O'[A]\ C)$ returns to the continuation $C$ the value of the applicative expression $A$,
we must define $O'[a]$ as $\mu c.\,ca$ for any identifier $a$. Thus, for example, if $a$ is
replaced by the value 3, $O'[a]$ will become $\mu c.\,c3$, which is consistent with our
convention.
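
For comparison with the normal-order rule, definition 2.5 can be sketched in Python with closures standing in for actors. The encoding is an assumption of this sketch: expressions are integers, strings, ("lam", x, body) and ("app", operator, operand) triples; the names translate_applicative and PRIMS are invented; and fail is a hypothetical primitive that raises instead of yielding a value.

```python
PRIMS = {
    "#": lambda x, c: c(x + 1),        # successor: value first, continuation last
    "fail": lambda x, c: c(1 // 0),    # hypothetical primitive that never yields a value
}


def translate_applicative(expr, env=None):
    """Return O'[expr] for single-argument functions (definition 2.5)."""
    env = env or {}
    if isinstance(expr, int):                      # clause 1: mu c. c n
        return lambda c: c(expr)
    if isinstance(expr, str):
        if expr in env:                            # clause 2: mu c. c x
            value = env[expr]
            return lambda c: c(value)
        primitive = PRIMS[expr]                    # clause 5: mu c. c p
        return lambda c: c(primitive)
    tag, first, second = expr
    if tag == "lam":                               # clause 3
        def function(x, c):
            return translate_applicative(second, {**env, first: x})(c)
        return lambda c: c(function)
    if tag == "app":                               # clause 4: operand value computed first
        operator = translate_applicative(first, env)
        operand = translate_applicative(second, env)
        return lambda c: operator(lambda y: operand(lambda z: y(z, c)))
    raise ValueError(f"not an applicative expression: {expr!r}")


out = []
translate_applicative(("app", "#", ("app", "#", 5)))(out.append)
translate_applicative(("app", ("lam", "x", "x"), 3))(out.append)
print(out)   # [7, 3]
```

Because identifiers are now bound to values rather than to unevaluated arguments, applying a function that ignores its parameter to the operand (fail 0) raises an error in this sketch, whereas the same expression yields a value under the normal-order rule.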

2.2.3:  Other  Reduction  Orders 

The translation rules presented above show (1) that any computation
expressible in the lambda calculus is expressible in the mu calculus, and (2) that the mu
calculus removes certain ambiguities about reduction order which are present in the
lambda calculus. Thus it is possible to write different translation rules which
effectively specify different reduction orders for the applicative expressions being
translated. Exponents of multiprocessing [2], though, have pointed out ways of
turning the ambiguity of the lambda calculus to advantage, using parallel reduction
orders where several subexpressions of an applicative expression are evaluated
simultaneously, perhaps by several processors operating in parallel. It is not
possible to write translation rules into the pure mu calculus which exhibit this kind of
behavior, since the pure mu calculus forces the explicit specification of the order
of events. At first sight, the ability of actor bodies to contain multiple events might
seem to provide the basic mechanism for this kind of parallel evaluation, but there
is no "join" operator to co-ordinate the results from two parallel chains of events
and use both to compute some final output. The next section describes an
extension to the mu calculus, called tokens, which permits the construction of a "join"
operator as well as providing certain other capabilities.




2.3: Tokens


A characteristic of the pure mu calculus is that it forms a purely additive axiom
system: if we denote by $\mathcal{E}^{*}$ the complete set of events that could eventually be
caused by a set $\mathcal{E}$ of initial events, then for two sets of initial events
$\mathcal{E}_1$ and $\mathcal{E}_2$, $(\mathcal{E}_1 \cup \mathcal{E}_2)^{*} = \mathcal{E}_1^{*} \cup \mathcal{E}_2^{*}$.
This characteristic makes it impossible to write any kind of
"join" operator in the pure mu calculus, for such an operator would allow us to
specify, for example, that some event $E$ was to occur as a result of the
occurrence of both $E_1$ and $E_2$, but not simply as a consequence of either occurring
alone. Thus $E \in \{E_1, E_2\}^{*}$ but $E \notin \{E_1\}^{*}$ and $E \notin \{E_2\}^{*}$. This contradicts the
above observation that $\{E_1, E_2\}^{*} = \{E_1\}^{*} \cup \{E_2\}^{*}$ and requires instead that
$\{E_1\}^{*} \cup \{E_2\}^{*} \subset \{E_1, E_2\}^{*}$. The fact that several initial events ought to be
able to have consequences that would not be caused by any one of those events
individually suggests that a new kind of mu-calculus axiom is needed. Axioms A1
and A2 showed how a single event could cause other events. We now see the
need for axioms wherein the conjunction of some set of events may cause other
events. This is what a "join" primitive would do, and is also a capability sufficient
for certain other simple communication and synchronization tasks. This capability is
provided by extending the pure mu calculus to include tokens.

Definition 2.6:

A token is an element of the set

    $T = \{\langle r_X, w_X \rangle \mid X \text{ is an object in the mu calculus}\}$

For any particular token $\langle r_X, w_X \rangle$, $r_X$ is known as the read side and $w_X$ as the
write side of the token.




Axiom A3 (tokens):

The pair of events $r_X A$ and $w_X B$ (where $\langle r_X, w_X \rangle \in T$) together cause the
event $AB$.

Additionally, a mechanism must be provided for generating tokens.

Axiom A4 (creation of tokens):

An event of the form $\tau C$ causes the event $C\, r_X w_X$.

A useful way of visualizing a token is as a pair of tables as shown in Figure
2.7.

    read table            write table

    $\mu x.\,{+}xxR$        3
    $R$                   4
                          5
    ...                   ...

Figure 2.7: Viewing a token as tables.


Every time the read side of a token receives a message, the item received is
entered in the first empty slot of that token's read table; messages received by
the write side are similarly entered in the write table. At any point, the set of
events that can have been caused by sending messages to the token is exactly
the set of all events $AB$ such that $A$ appears in the read table and $B$ appears in
the write table.

Assuming that $X$ is the identifier of the token depicted in Figure 2.7, the figure
shows the state of affairs that would exist after the five events

    $r_X\,(\mu x.\,{+}xxR)$
    $r_X\,R$
    $w_X\,3$
    $w_X\,4$
    $w_X\,5$

had been caused, and would in turn lead to the eventual causation of the six
events

    $(\mu x.\,{+}xxR)\,3$        $R\,3$
    $(\mu x.\,{+}xxR)\,4$        $R\,4$
    $(\mu x.\,{+}xxR)\,5$        $R\,5$


A useful implementation of tokens would of course not wait for all operations on the
token to have been completed before causing any of these events, but rather
would generate the events incrementally. For example, every time an entry was
added to the write table it could simultaneously be sent to each object currently
present in the read table; addition of an entry to the read table could be handled
similarly.
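
The incremental scheme just described can be sketched in Python as follows. This is one possible reading of the "table" model, not the implementation developed in the next chapter; the class name Token, its method names, and the representation of a caused event AB as the pair (A, B) are assumptions of the sketch.

```python
class Token:
    """A token viewed as a read table and a write table (axiom A3)."""

    def __init__(self):
        self.read_table = []
        self.write_table = []
        self.caused = []          # events (A, B) generated so far

    def send_read(self, a):
        """Event r A: record A and pair it with every value written so far."""
        self.read_table.append(a)
        self.caused.extend((a, b) for b in self.write_table)

    def send_write(self, b):
        """Event w B: record B and pair it with every reader seen so far."""
        self.write_table.append(b)
        self.caused.extend((a, b) for a in self.read_table)


# The five events of Figure 2.7, delivered in an arbitrary interleaving.
token = Token()
token.send_read("mu x.+xxR")
token.send_write(3)
token.send_read("R")
token.send_write(4)
token.send_write(5)
print(len(token.caused))   # 6
```

Each pair is generated exactly once, at the moment the later of its two halves arrives, so the set of caused events does not depend on the order of arrival.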

From this analogy it is possible to infer several properties of tokens:

1. If a token stops receiving messages, every value appearing in the write
table (i.e., that was sent to the write side) will ultimately be sent to each
object appearing in the read table (i.e., that was sent to the read side).

2. A token never loses information: a value, once received, remains entered in
the appropriate table forever.

3. The (externally detectable) state of a token depends only on the identity of
the objects communicated to the two sides of the token, not on the order in
which the messages were received (the position of an item in a read or
write table is not significant, only its presence).

4. In their full generality, tokens form a rather bizarre control structure.

Properties (1) and (2) have simple explanations in terms of the axiomatic
definition of tokens. Property (3) can be explained by noting that axiom A3 does
not place any ordering on messages received by either side of a token;
consequently, the ordering introduced in the "table" analogy is artificial and cannot be
detected except by examining the table. These three properties indicate that
tokens are a special kind of object, in a twilight zone between immutability and
complete mutability. It is possible to modify a token by sending it a message, but
such a modification can only add information to the token; old information can never
be destroyed. Thus, for example, an old copy of a token is guaranteed to contain
a subset of the information present in the current version of the token.
Consequently, an old copy of a token never becomes invalid in the sense of containing
incorrect information; it simply becomes less and less useful as additional
information accumulates in the primary copy of the token. Tokens share this
characteristic with, for example, eventcounts as described by Reed and Kanodia [24], where
an old value of an eventcount is never dangerous; it may just be sufficiently out
of date to be of little use for the intended application. Such methods of
constraining the mutability of objects promise to be of considerable use in implementing
distributed systems.

Property (4) raises two questions: what are tokens good for? and how can
tokens be implemented practically? The first of these questions is addressed in
the next section. As for the second, one possible implementation is suggested by
the "table" model. Further consideration and refinement should properly await the
discussion in the next section showing the typical ways in which tokens might
actually be used. In fact, discussion of these implementation details will be postponed
until the next chapter.

2.4: Parallel Evaluation and Self-Reference
2.4.1:  Parallel  Evaluation  of  Applicative  Expressions 

We now return to the subject being discussed before the introduction of
tokens, namely the translation of a program expressed in an applicative language
such as the lambda calculus into an equivalent program expressed in the mu
calculus. An objection that had been raised to the translation rules presented was
that they specified particular evaluation orders and hence destroyed some
information present in the original regarding flexibility of evaluation order. In particular,
there are some possibilities for concurrency that do not violate the semantics of
applicative expressions, but the pure mu calculus provides no way of expressing
the possibility of such concurrency in the translation. This deficiency of the pure
mu calculus was shown to be related to the impossibility of writing a "join"
operator. We now show (by example) that it is possible to express this kind of parallel
evaluation in the mu calculus with tokens.

We shall continue to restrict our attention to single-argument functions; hence,
the kind of parallel evaluation we shall investigate is the evaluation of the operator
of a combination in parallel with that of its operand. In order to co-ordinate these
activities, a new token will be created every time the evaluation of a combination
is begun. The read side of the token will be given to the operator as a "future"
(see Baker and Hewitt [2]) for the value of the operand, and the write side will be
held for the operand to send its value to. Thus the process of evaluation of the
operator will, whenever it requires the value of the operand, send a continuation to
the read side of the token; if the operand value has been computed, the
continuation will then be sent that value. Otherwise, that line of activity will cease until
the operand value becomes available (i.e., is sent to the write side of the token),
at which point all continuations sent to the read side of the token will receive the
operand value. In detail, the new translation rule is as follows:

Definition 2.8:

Given any applicative expression $A$, the object $O''[A]$ is

1. If $A$ is a constant $n$, then $\mu c.\,cn$.

2. If $A$ is an identifier $x$, then $x$.

3. If $A$ is a $\lambda$-expression $\lambda x.M$, then $\mu c.\,c(\mu xc.\,(O''[M]\ c))$, where $c$ is some
identifier other than $x$ which does not occur free in $M$.

4. If $A$ is a combination $PQ$, then
$\mu c.\,(\tau\,(\mu rw.\,(O''[Q]\ w)(O''[P]\ \mu y.\,yrc)))$, where $c$, $r$, and $w$ are
identifiers that do not occur free in $A$.

5. If $A$ is a unary primitive (such as $\#$), then
$\mu c.\,c(\mu ad.\,a(\mu x.\,px(\mu z.\,(\mu e.\,ez)d)))$, where $p$ is the mu-calculus
primitive corresponding to $A$.

This translation rule closely resembles the normal-order translation rule $O$ given in
definition 2.4. In fact, other than the restriction to single-argument functions, the
only difference is in the handling of combinations in clause 4. In this translation
rule $O''$, as described above, whenever a combination is to be evaluated a new
token is created using the $\tau$ operator. The resulting read and write sides $r_X$ and $w_X$
replace the variables $r$ and $w$ respectively, and two parallel events are caused.
The first initiates a chain of events culminating in the receipt by $w_X$ of the value of
the operand $Q$; the second leads to the continuation $\mu y.\,yrc$ receiving the value of
$P$, presumably some sort of function, after which execution of the body of the
function will begin, with the formal parameter replaced by the read side $r_X$.

It is interesting to note that any variable present in the original applicative
expression $A$ will appear in the translation $O''[A]$ in such a manner that it will
eventually be replaced by the read side of some token (unless the variable is never
replaced at all). Thus all situations which under ordinary evaluation would call for a
variable in $A$ to be bound to a value will instead result in the variable being bound
to a future for that value. In this sense, the translation $O''[A]$ represents a
maximally parallel evaluation of $A$: every evaluation begins as soon as it can, and no
evaluation waits for another until a primitive requiring the value of some future is
invoked.

An amusing property of this scheme is that if $A$ has any operands with no
normal form, then even though $O''[A]$ may eventually cause the value of $A$ to be
returned to its continuation, some sub-evaluations spawned by $O''[A]$ will never
return any value to their continuations. In a straightforward implementation of the
mu calculus, these evaluations will continue forever! Even if these evaluations
terminate, but just take a long time, they are still bothersome. The problem of
identifying and "garbage-collecting" these irrelevant computations is dealt with by Baker
and Hewitt [2], who use a somewhat different concept of "future" than that
described here, however. Fortunately, any expression whose applicative-order
evaluation (as defined by definition 2.5) terminates is guaranteed not to leave any
nonterminating evaluations under the rule $O''$. Unfortunately, restricting ourselves
to this class of expressions rules out some interesting possibilities, such as parallel
evaluation of the two arms of a conditional before it is determined which arm is
desired.




2.4.2: Parallel Evaluation of Actors

So far, we have been devoting considerable attention to translation rules from
applicative expressions to mu-calculus objects. This has not been in an attempt to
demonstrate the feasibility of a system which performs this translation before
evaluating an applicative expression (although that is one possible goal), but an
attempt to show the fairly straightforward relationship that exists between
applicative programming and message-passing programming. The aim has been to show
that the message-passing model combines the ability to specify fixed orders of
evaluation with (when augmented by tokens) the ability to specify many useful
forms of concurrency. As another illustration of this property, we consider the
construction of a mu-calculus parallelism actor $\pi$.

The behavior we desire is that an event $\pi ABC$ will ultimately cause the event
$CXY$, where $X$ is the value $A$ returns to its continuation and $Y$ is the value $B$ returns
to its continuation (i.e., for any $K$, $AK \longrightarrow^{+} KX$ and $BK \longrightarrow^{+} KY$). Thus if $A$ and $B$
contain two calculations which may be carried out in parallel, $\pi$ arranges for them
to happen concurrently, and after both have finished, for the resulting values to be
forwarded to $C$. Using tokens, $\pi$ may be implemented as

    $\pi = \mu abc.\,\tau\,\mu rw.\,(aw)(b\,\mu y.\,r\,\mu x.\,cxy)$

The idea is that the value $X$ will be computed and sent to the write side of the
token, while the computation of $Y$ is occurring simultaneously. When $Y$ is computed,
it is sent to the continuation $\mu y.\,r\,\mu x.\,Cxy$, which leads to the event $r\,\mu x.\,CxY$, where $r$
is the read side of the token. When both sides of the token have received their
respective messages, the event $(\mu x.\,CxY)X$ will then be caused, leading in turn to
the desired consequence, $CXY$.

$\pi$ is an attractive package for this functionality, and might well be the user's
main interface with tokens in a programming language based on the mu calculus. As
an example of what $\pi$ might be good for, consider two functions f and g, and
suppose we wanted to compute the sum f(3)+g(4), evaluating the two terms in
parallel. In mu-calculus parlance, this would become

$$\pi(\mu c.\,f3c)(\mu c.\,g4c)(\mu xy.\,{+}xyR)$$
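The construction of the parallelism actor from a token can be imitated in an ordinary programming language. The following Python sketch is only an illustration under stated assumptions: actors are modelled as Python callables, events are executed immediately instead of concurrently, and the names `Token`, `tau`, `pi`, `f`, `g`, and `plus` are hypothetical names introduced for this sketch, not part of the mu calculus.

```python
class Token:
    """Hypothetical model of a token: every (read, write) pair fires once."""

    def __init__(self):
        self.readers = []  # actors sent to the read side so far
        self.values = []   # objects sent to the write side so far

    def read(self, actor):
        # The event (r A): pair A with every object already written.
        self.readers.append(actor)
        for value in list(self.values):
            actor(value)

    def write(self, value):
        # The event (w B): pair B with every actor already waiting.
        self.values.append(value)
        for actor in list(self.readers):
            actor(value)


def tau(c):
    """Token creation: send the two sides of a fresh token to c."""
    token = Token()
    c(token.read, token.write)


def pi(a, b, c):
    """pi = mu abc. tau(mu rw. (aw)(b mu y. r mu x. cxy))"""
    def body(r, w):
        a(w)                               # X is sent to the write side
        b(lambda y: r(lambda x: c(x, y)))  # Y then waits on the read side
    tau(body)


# Assumed example functions, written in continuation-passing style.
def f(n, c):
    c(n * n)


def g(n, c):
    c(n + 1)


def plus(x, y, c):
    c(x + y)


results = []
R = results.append  # the result continuation R
pi(lambda c: f(3, c), lambda c: g(4, c), lambda x, y: plus(x, y, R))
print(results)  # [14]
```

Because the token pairs each read with each write regardless of arrival order, the same result is obtained if the two events in the body of `pi` are processed in the opposite order.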

Although $\pi$ can be constructed using tokens, the author has discovered no way
to construct a token using $\pi$ and believes this task to be impossible. This accounts
for the decision to axiomatize tokens rather than the more obvious but seemingly
less general construct $\pi$.

2.4.3: Representing List Structures

In preparation for discussing another application of tokens, we digress briefly
to consider how LISP-like list structures might be represented in the mu calculus.
The fundamental LISP list-structure operator is the constructor function cons, which
takes two items and binds them together into one. In message-passing terms,

$$(\mathit{cons}\ A\ B\ C) \to^* (C\ X)$$

where X contains within it the arguments A and B. We may write cons as

$$\mathit{cons} \equiv \mu abc.\,c(\mu x.\,xab)$$

in which case

$$(\mathit{cons}\ A\ B\ C) \to (C\ \mu x.\,xAB)$$

The constructor function cons is complemented by two selector functions car and
cdr, expressed as

$$\mathit{car} \equiv \mu xc.\,x(\mu ab.\,ca)$$
$$\mathit{cdr} \equiv \mu xc.\,x(\mu ab.\,cb)$$

which select out the first and second components, respectively, of an item
constructed by cons. Cast into message-passing form, the familiar identities regarding
the relationship between car, cdr, and cons are

$$(\mathit{cons}\ A\ B\ \mu x.(\mathit{car}\ x\ C)) \to^* CA$$
$$(\mathit{cons}\ A\ B\ \mu x.(\mathit{cdr}\ x\ C)) \to^* CB$$

Verification  that  these  identities  are  true  of  our  implementation  is  left  to  the 
reader. 
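
One way to gain confidence in these identities is to transcribe the three definitions into a language with first-class functions. The Python sketch below does so under the assumption that actors can be modelled as callables and that sending a message is an ordinary call; the list `received` and the strings "A" and "B" are placeholders chosen for this illustration.

```python
def cons(a, b, c):
    """cons = mu abc. c(mu x. xab)"""
    c(lambda x: x(a, b))


def car(x, c):
    """car = mu xc. x(mu ab. ca)"""
    x(lambda a, b: c(a))


def cdr(x, c):
    """cdr = mu xc. x(mu ab. cb)"""
    x(lambda a, b: c(b))


received = []
C = received.append  # stands for the continuation C

cons("A", "B", lambda x: car(x, C))  # (cons A B mu x.(car x C))
cons("A", "B", lambda x: cdr(x, C))  # (cons A B mu x.(cdr x C))
print(received)  # ['A', 'B']
```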

2.4.4:  Self-Reference 

A capability of tokens which is quite independent of their usefulness in parallel
evaluation is their use to create self-referential structures. In the lambda calculus,
a kind of self-reference can be achieved by using the Y operator

$$Y = \lambda g.\,(\lambda h.\,g(hh))(\lambda h.\,g(hh))$$

Thus in the expression $Y(\lambda f.\,F)$, free occurrences of the identifier f in the
expression F effectively refer to the expression F itself. This device is obviously
available in the pure mu calculus by simply using the translation rule in definition 2.4.

Tokens provide a similar capability in a way which is somewhat less magical
than the machinations of the Y operator. Using tokens is likely to be more efficient
(depending on the implementation of tokens) and is certainly more straightforward.
As a simple example, let us consider the manufacture of a circular list whose car
contains some datum A and whose cdr is the list itself. In the pure mu calculus, an
object such as this would have to be constructed with the help of some actor
derived from the Y operator. Using tokens, we can construct the list (and send it
to our result continuation R) starting with the following event:

$$\tau(\mu rw.\,(\mathit{cons}\ A\ w\ (\mu x.\,(R\ x)(r\ x))))$$

The chain of events that will ensue is as follows (abbreviating by Z the token
identifier $\mu rw.\,(\mathit{cons}\ A\ w\ (\mu x.\,(R\ x)(r\ x)))$):

$$(\mathit{cons}\ A\ w_Z\ (\mu x.\,(R\ x)(r_Z\ x)))$$
$$(\mu x.\,(R\ x)(r_Z\ x))(\mu x.\,x\,A\,w_Z)$$
$$(R\ (\mu x.\,x\,A\,w_Z)) \text{ and } (r_Z\ (\mu x.\,x\,A\,w_Z))$$

Let us use B to denote the actor $\mu x.\,xAw_Z$ received by R. Then (car B C) will
eventually cause (C A), as we would hope, and (cdr B C) will cause $(C\ w_Z)$. Let
us see what happens if this value is ever passed to one of the list-structure
selectors, say car.

$$(\mathit{car}\ w_Z\ C)$$
$$\to (w_Z\ (\mu ab.\,Ca))$$
$$\to (\mu x.\,xAw_Z)(\mu ab.\,Ca)$$
$$\to (\mu ab.\,Ca)Aw_Z$$
$$\to CA$$

The only tricky part of the above sequence occurs between the second and third
lines, where the message sent to $w_Z$ is sent in turn to the actor $\mu x.\,xAw_Z$. This
happens because that was the (only) value sent to $r_Z$ when the self-referential
structure was created. If the operator in this example had been cdr instead of
car, then the last line above would have been $Cw_Z$ instead of CA. Thus we really
have managed to capture the essence of a list structure which loops back on
itself.
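
The circular list can be exercised in a small simulation. The Python sketch below rests on several assumptions: actors are callables, events run immediately in program order, and `Token`, `tau`, and `nth_car` are hypothetical helpers introduced here. It builds the list with one token and then follows the cdr pointer several times, taking the car each time.

```python
class Token:
    """Hypothetical model of a token: every (read, write) pair fires once."""

    def __init__(self):
        self.readers = []
        self.values = []

    def read(self, actor):
        self.readers.append(actor)
        for value in list(self.values):
            actor(value)

    def write(self, value):
        self.values.append(value)
        for actor in list(self.readers):
            actor(value)


def tau(c):
    token = Token()
    c(token.read, token.write)


def cons(a, b, c):
    c(lambda x: x(a, b))


def car(x, c):
    x(lambda a, b: c(a))


def cdr(x, c):
    x(lambda a, b: c(b))


lists = []
R = lists.append
# tau(mu rw. (cons A w (mu x. (R x)(r x))))
tau(lambda r, w: cons("A", w, lambda x: (R(x), r(x))))
B = lists[0]  # the actor received by R


def nth_car(lst, n, c):
    """Send to c the car of the object reached after n cdr steps."""
    if n == 0:
        car(lst, c)
    else:
        cdr(lst, lambda rest: nth_car(rest, n - 1, c))


seen = []
for n in range(4):
    nth_car(B, n, seen.append)
print(seen)  # ['A', 'A', 'A', 'A']
```

Each selector applied to the write side sends one more message through the token, while the read side holds the single actor sent to it when the structure was created.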

Unfortunately, this example only happens to work so nicely because the object
returned by cons is a single-argument actor. In any other case, the current token
mechanism, which only allows the sending of single-element messages through the
pipe, would have been inadequate. This is an indication of a kind of inflexibility of
tokens that we shall discuss again after seeing how tokens can be used to
implement recursion.

2.4.5:  Recursion 

Throughout this section, we will assume that the recursive function of interest
is defined as a recursive kernel F in which free occurrences of the identifier f
represent recursive references to F. An example of such a definition for the
factorial function would be

$$F \equiv \mu nc.\,{>}n1\,\mu t.\,(t(\mu c.\,{-}n1\,\mu m.\,fm\,\mu x.\,{\times}nxc)(\mu c.\,c1)(\mu a.\,ac))$$

From F we can construct a "functional" F* in which f is bound:

$$F^* \equiv \mu fc.\,cF$$

Now, using an approach similar to that pursued in the last section, we can write the
event

$$\tau(\mu rw.\,(F^*(\mu xc.\,r\,\mu g.\,gxc)(\mu h.\,(Rh)(wh))))$$

which sends to R an object which is the recursive function intended by F (i.e.,
the least fixed point of the functional F*). Abstracting out the functional F* and
the continuation R, we can derive a kind of mu-calculus Y operator using tokens
as

$$Y_\mu \equiv \mu fc.\,\tau(\mu rw.\,(f(\mu xc.\,r\,\mu g.\,gxc)(\mu h.\,(ch)(wh))))$$

A formal statement of the abovementioned fixed-point property is that for any
continuation C and functional F*,

$$(Y_\mu\ F^*\ C) \to^* CF$$

where F is the least fixed point of F*.
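
The token-based fixed-point construction can be tried out on the factorial example. The Python sketch below is an illustration only, with these assumptions: actors are callables, events run immediately, the conditional in the kernel is written with a Python `if` instead of the mu-calculus selector, and `Token`, `tau`, `Y`, and `factorial_functional` are hypothetical names.

```python
class Token:
    """Hypothetical model of a token: every (read, write) pair fires once."""

    def __init__(self):
        self.readers = []
        self.values = []

    def read(self, actor):
        self.readers.append(actor)
        for value in list(self.values):
            actor(value)

    def write(self, value):
        self.values.append(value)
        for actor in list(self.readers):
            actor(value)


def tau(c):
    token = Token()
    c(token.read, token.write)


def Y(f, c):
    """Y = mu fc. tau(mu rw. (f (mu xc. r mu g. gxc) (mu h. (ch)(wh))))"""
    def body(r, w):
        f(lambda x, k: r(lambda g: g(x, k)),  # recursive reference via the token
          lambda h: (c(h), w(h)))             # deliver h to c and write it
    tau(body)


def factorial_functional(f, c):
    """F* = mu fc. cF, where F is the factorial kernel with f free."""
    def F(n, k):
        if n > 1:
            f(n - 1, lambda x: k(n * x))
        else:
            k(1)
    c(F)


functions = []
Y(factorial_functional, functions.append)
factorial = functions[0]

answers = []
factorial(5, answers.append)
print(answers)  # [120]
```

In this run the token is written once (with the finished function) and read once per recursive call.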

As mentioned in the previous section, recursion is possible even in the pure mu
calculus, by translation from lambda-calculus recursion if by no other route. For
comparison, then, here is an equivalent operator $Y_\mu$ which does not use tokens:

$$Y_\mu \equiv \mu fc.\,AAc \quad\text{where}\quad A \equiv \mu gc.\,f(\mu xc.\,gg\,\mu h.\,hxc)c$$

Unlike the lambda-calculus Y operator, however, which can compute the least fixed
point of an arbitrary expression, these operators $Y_\mu$ depend on F being a
single-argument function (strictly speaking, on F being an actor that accepts
two-element messages). Thus perhaps each of these should really have been labelled
$Y_{\mu 1}$ to emphasize this restriction. It is of course possible to construct a
$Y_{\mu 2}$; the version using tokens is

$$Y_{\mu 2} \equiv \mu fc.\,\tau(\mu rw.\,(f(\mu xyc.\,r\,\mu g.\,gxyc)(\mu h.\,(ch)(wh))))$$

Nevertheless, it is annoying to have to have all these different versions of Y. One
solution to this problem is to change the semantics of the original recursive kernel
F so that free occurrences of f refer not to F itself but to some object (such as
the read side of a token) which will return F to its continuation. Using these
modified semantics, the recursive kernel for the factorial function would become

$$F \equiv \mu nc.\,{>}n1\,\mu t.\,(t(\mu c.\,{-}n1\,\mu m.\,f\,\mu g.\,gm\,\mu x.\,{\times}nxc)(\mu c.\,c1)(\mu a.\,ac))$$

The functional F* is, as before, defined to be

$$F^* \equiv \mu fc.\,cF$$

Now we may define a generalized mu-calculus fixed-point operator (using tokens)
as

$$Y_\mu \equiv \mu fc.\,\tau(\mu rw.\,(fr(\mu h.\,(ch)(wh))))$$

The equivalent operator in the pure mu calculus is

$$Y_\mu \equiv \mu fc.\,AAc \quad\text{where}\quad A \equiv \mu gc.\,f(\mu c.\,ggc)c$$

2.4.6:  Conclusions  Regarding  the  Use  of  Tokens 

One conclusion to be drawn from the above discussion of self-reference and
recursion is that tokens might be more useful if it were possible to send other than
single-element messages from the write side to the read side of a token. For
example, we might consider changing axiom A3 to

Axiom A3' (extended tokens):

The pair of events $(r_x\ A)$ and $(w_x\ B_1 B_2 \ldots B_n)$ (x the same object in both
cases) together cause the event $(A\ B_1 B_2 \ldots B_n)$.

With this new axiom, the use of tokens for recursion would become simpler and the
objections raised in connection with generating self-referential list structures (that
it was only by accident that tokens turned out to be directly applicable) would be
answered. However, there is no increase in power as a result of this change. For
the token described in axiom A3' to be used productively, any object A sent to the
read side of the token had better be of the appropriate functionality to receive an
n-element message. Similarly, all messages received by the write side had better
have n elements, otherwise they will not fit the actors waiting on the read side.
Given that n is thus effectively fixed for any particular token, we can use our
knowledge of n to simulate a new-style token using an old-style token as follows:

$$r'_x \equiv \mu a.\,r_x\,\mu v.\,va$$
$$w'_x \equiv \mu b_1 b_2 \ldots b_n.\,w_x(\mu a.\,ab_1 b_2 \ldots b_n)$$

where $r'_x$ and $w'_x$ are the sides of the new "token," and $r_x$ and $w_x$ are the sides
of a token as described in axiom A3. The principal advantage of defining tokens as
in axiom A3' is that the value of n need not be known beforehand and therefore
more generalized actors can be constructed, for example a Y operator which can
perform the function of any of $Y_{\mu 1}$, $Y_{\mu 2}$, etc., depending on what is required,
without forcing the user into the extra complication (and inefficiency) of our
solution involving the generalized $Y_\mu$. As a feature of a programming language, then,
this extra flexibility would be of considerable merit. As a feature of a formalism for
studying computation, such as the mu calculus, its value is more questionable. For
the purposes of this research, then, we stick to the simpler definition of tokens
given in axiom A3. Our approach to this problem resembles the use of Curried
functions in the lambda calculus, whereby all computations can be expressed using
only single-argument lambda-expressions.
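
The simulation of a multi-element token by a single-element token can be sketched as follows. This Python illustration assumes that actors are callables and that events run immediately; `Token`, `tau`, and `extend` are hypothetical names. Python's variable-length argument lists let the sketch avoid fixing n in advance, which the mu-calculus definition above cannot do.

```python
class Token:
    """Hypothetical model of an old-style (single-element) token."""

    def __init__(self):
        self.readers = []
        self.values = []

    def read(self, actor):
        self.readers.append(actor)
        for value in list(self.values):
            actor(value)

    def write(self, value):
        self.values.append(value)
        for actor in list(self.readers):
            actor(value)


def tau(c):
    token = Token()
    c(token.read, token.write)


def extend(r, w):
    """Build the sides of a new-style token from the sides r, w of an old one."""
    def r_ext(a):
        # r' = mu a. r mu v. va
        r(lambda v: v(a))

    def w_ext(*bs):
        # w' = mu b1 b2 ... bn. w(mu a. a b1 b2 ... bn)
        w(lambda a: a(*bs))

    return r_ext, w_ext


log = []


def demo(r, w):
    r_ext, w_ext = extend(r, w)
    r_ext(lambda x, y: log.append((x, y)))  # a reader expecting n = 2 elements
    w_ext(1, 2)
    w_ext(3, 4)


tau(demo)
print(log)  # [(1, 2), (3, 4)]
```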

Another issue regarding tokens is whether the full generality of being able to
have arbitrarily long read and write tables is of any use. The driving force behind
adopting that definition is that it simplifies the formal properties of tokens. If, for
example, tokens were defined so that they could only be written once (as they are
in Henderson[13]), but could be read any number of times, the question would arise
as to what to do if two attempts were made to write a given token. If the
decision were made that the second write would "fail," then "second" would have to
be defined, which would be difficult if the two events were not ordered with
respect to each other by the causality relation. Any useful definition of "second"
would introduce indeterminacy into the system, whereas the current token system
is perfectly deterministic.

A second approach is to look at the uses we have found for tokens. The
parallel evaluation examples generally involved a token being written at most once
and read some number of times; the self-reference example involved reading once
and writing many times (each time causing the read to complete again with a
different value!); and the recursion examples given wrote once and read many
times. No example was given where the read and write sides of a single token
could each receive more than one message. However, a parallel evaluator (as in
definition 2.8) for a multiple-valued applicative language such as Ward's EITHER-K
theory[34] would use this property of tokens. This may be of limited interest in
practical cases, but serves to add credence to the claim that the theoretical
approach chosen includes the proper amount of generality.

The third angle from which tokens could be viewed is from the perspective of
possible implementations. It will be seen in the next chapter that there are indeed
reasonable ways of implementing the fully general token mechanism without
excessive overhead.


2.5: Cells

While tokens extend the power of the mu calculus, they certainly do not
enable us to model all interesting computations. In order to exhibit an actor
implementation of a file system or a data base, some mechanism for causing side
effects, that is, permanently changing the state of an object in a way that
destroys information about previous states of the object, is almost essential. We shall
argue at various points in this thesis that the explicit use of cells should be
minimized, that other mechanisms with more confined kinds of mutability (such as tokens)
can often be used more efficiently in distributed systems, but the fact remains that
this approach will not suit all situations. Consequently, we introduce the concept
of a cell[11] as the means by which arbitrary side effects may be achieved.
Unfortunately, the introduction of cells adds considerable complication to the
axiomatic description of the mu calculus, so we begin informally.

2.5.1: Informal Description of Cells

Somewhat in the same way as we viewed tokens, we view a cell as being
composed of two ports: an update port $u_x$ and a contents port $c_x$, where x is a
unique identifier for that cell. The update port accepts messages with two
components: a new value V and a continuation C. After updating the cell to have
value V, the event (C) is caused. The purpose of including this continuation is that
it can contain computations that should not begin until the update has been
completed. The contents port of the cell simply accepts a continuation C and causes
the event CV, where V is the current value of the cell. Similar to the $\tau$ operator for
creating tokens, we postulate a # operator for creating cells, such that #VC
causes $Cc_xu_x$ for some previously unused cell identifier x. The initial value of the
cell will be V.
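
The two-port view of a cell can be illustrated with closures. The Python sketch below is an assumption-laden model, not an implementation from this thesis: actors are callables, events run immediately, and `new_cell`, `contents`, `update`, and `use` are hypothetical names. The creation function sends the contents port and the update port, in that order, to the continuation.

```python
def new_cell(v, c):
    """Model of cell creation: send the (contents, update) ports to c."""
    state = {"value": v}  # private storage standing for the cell x

    def contents(k):
        # Contents port: send the current value to the continuation k.
        k(state["value"])

    def update(new_value, k):
        # Update port: store the new value, then cause the event (k).
        state["value"] = new_value
        k()

    c(contents, update)


printed = []
R = printed.append


def use(contents, update):
    contents(R)                     # reads the initial value
    update(7, lambda: contents(R))  # this read begins only after the update


new_cell(3, use)
print(printed)  # [3, 7]
```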


The notion of a cell having two ports as described above is somewhat
unconventional; it is motivated by the desire that all transactions with a cell be by
sending messages to it, rather than being mediated by special operators which we
would have to invent. Even so, one can envision various devices by which an
update message to a cell could be distinguished from a value message, but at the
level of this work it is worth little to engineer such ad hoc solutions. A user who
was annoyed by continually having to pass around both ports of a cell could easily
design an actor which would package them into one unit. In any case, we make no
claim that the structure presented here is superior, only that it is simple and
adequate for our purposes.

2.5.2: A New Axiom Scheme for the Mu Calculus

In the foregoing sections, we have tacitly assumed that in a useful
implementation of an interpreter for the mu calculus, every event that in theory could be
caused by the initial set would eventually occur. In the absence of cells, such an
interpreter would be deterministic in the sense that the same events would always
result from a particular initial set of events. Cells, however, introduce the
possibility of nondeterminism. If more than one ordering of the operations on a cell is
possible, then different orderings may give rise to different results. This is not the
same situation that arises in, say, handling of multiple actor bodies, which may
result in, for example, R receiving the value 3 and also the value 4. This latter
situation may arise as the natural consequence of some multiple-valued
computation; the former refers to the possibility of an entire computation (including all
concurrent events) taking one of two alternative routes which are not consistent with
one another. For example, one might be characterized by the value of a particular
cell being 0, the other by the value of that cell being 1. It is not possible for a cell
to have two values at the same time.


The introduction of objects such as cells in which state changes performed in
different orders can produce different results requires some radical changes in our
thinking. For example, it is no longer useful to talk about the closure E* of an
initial event set E as containing all the events that could be caused by events in E,
because some of these events may be inconsistent with each other (e.g., be based
on different assumptions about the value of a cell). Even if two event sets $E_1$ and
$E_2$ each have closures which do not contain inconsistencies (i.e., lead to
determinate computations), their union may not, due to interactions between cell
operations performed by $E_1$ and $E_2$. Finally, in the presence of cells, it becomes
possible to detect multiple executions of an event, leading to questions about the
status of multiple copies of an identical event. Of course, all of this is just a
rehash of the traditional problems encountered in trying to construct a sound
semantic basis for a system incorporating both parallelism and side effects[11].

The approach we take here is based on a suggestion by Henry Baker that
what is needed is not a relation on events, as is provided by our current axiom
system, but a relation on partial execution histories, where an execution history
would be said to cause all legal execution histories of which it was an initial
substring. Instead of execution histories, however, we will consider system states,
which are basically execution histories with the portions which are no longer
relevant discarded. A system state will be expressed as a set of events; the
axioms will describe what new states could be caused by the current state. Thus,
for example, if a state S included the event

$$E \equiv (\mu xy.\,(Rx)(Ry))\ 3\ 4$$

then a new state S' that could be caused by S would be a state which contained
all events in S except for E, and additionally contained the two events R3 and
R4 which are caused by E. The fact that E was part of the execution history is
not retained in S' because it can have no bearing on any subsequent state
transitions. We already have the consequents we want from E, and in a system with
cells it would be dangerous to produce them again, so at best we could retain a
copy of E in S' which was specially marked as already having been processed.
Since we have no use for this information, we choose to delete it instead.

To avoid confusion with the event-causality relation $\to$, we denote the
state-causality relation by $\Rightarrow$, using the related notation $\Rightarrow^+$ for the transitive closure
of $\Rightarrow$ and $\Rightarrow^*$ for the reflexive transitive closure. The new set of mu-calculus
axioms (up to and including cells) is then

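Axiom B1 (mu-reduction):

If $E \equiv ((\mu x_1 x_2 \ldots x_n.\,B_1 B_2 \ldots B_m)\ A_1 A_2 \ldots A_n) \in S$, then
$S \Rightarrow (S - \{E\}) \cup S'$, where $S' = \{B_i' \mid 1 \le i \le m\}$ and each $B_i'$ is the
body $B_i$ with $A_j$ substituted for the free occurrences of $x_j$.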


Axiom B2 (primitives):

If $E \equiv (p\,c_1 c_2 \ldots c_n\,C) \in S$, where p is a primitive and each $c_i$ is a constant, then
$S \Rightarrow (S - \{E\}) \cup \{(C\ \hat{p}[c_1; c_2; \ldots; c_n])\}$, where $\hat{p}$ is the function denoted by p.


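Axiom B3 (tokens):

This axiom comes in two parts:

a) If $E \equiv (w_x\ A) \in S$, then $S \Rightarrow (S - \{E\}) \cup \{(w_{xy}\ A)\}$, where y is a new identifier
that does not appear in any event in S.

b) If $\mathcal{C} \equiv \{(r_x\ A)_{a_1 a_2 \ldots a_n}, (w_{xy}\ B)\} \subseteq S$, then
$S \Rightarrow (S - \mathcal{C}) \cup \{(r_x\ A)_{a_1 a_2 \ldots a_n y}, (w_{xy}\ B), (AB)\}$, provided that y does not
appear on the list $a_1 a_2 \ldots a_n$.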

Axiom B4 (creation of tokens):

If $E \equiv (\tau C) \in S$, then $S \Rightarrow (S - \{E\}) \cup \{(Cr_xw_x)\}$, where x is a new identifier that
does not appear in any event in S.

Axiom B5 (updating cells):

If $\mathcal{C} \equiv \{(u_x\ A)', (u_x\ BC)\} \subseteq S$, then $S \Rightarrow (S - \mathcal{C}) \cup \{(u_x\ B)', (C)\}$.

Axiom B6 (reading cells):

If $\{(u_x\ A)', (c_x\ C)\} \subseteq S$, then $S \Rightarrow (S - \{(c_x\ C)\}) \cup \{(CA)\}$.

Axiom B7 (creation of cells):

If $E \equiv (\#VC) \in S$, then $S \Rightarrow (S - \{E\}) \cup \{(u_x\ V)', (Cc_xu_x)\}$, where x is a new
identifier that does not appear in any event in S.


2.5.3: Discussion


Axioms B1 and B2 are fairly straightforward adaptations of axioms A1 and A2
to use the state-causality concept already discussed. Axiom B3, however, contains
enough complication to bring out most of the subtle points of the new scheme. The
first point concerns duplication of events. Strictly speaking, the states referred to
above are not sets of events, but rather collections of events (in the sense
defined by Hewitt for PLASMA[16]) in which duplicates are allowed. If some state
contained two copies of the event R3, we assume that the intent was in fact that
the value 3 should be printed twice. For a more sophisticated example, consider a
state containing two copies of the event

$$c_x(\mu a.\,{+}a1(\mu b.\,u_x b(\mu.\,Rb)))$$

In this case, it is at least plausible that the cell x might be incremented twice
(although without any kind of mutual exclusion operator we cannot be sure this will
be the result). In any case, a state with two copies of that event is clearly a
different entity from a state with some other number of copies. Thus the state
union $S \cup T$ denotes a state containing a copy corresponding to every event in S as
well as a copy corresponding to every element in T, and the state difference $S - T$
denotes a state which is like S except one copy of every event in T has been
removed.
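
The behavior of state union and difference on collections with duplicates matches that of multisets. As a brief illustration, Python's `collections.Counter` provides both operations; representing events by the strings "R3" and "R4" is an assumption made only for this sketch.

```python
from collections import Counter

S = Counter({"R3": 2, "R4": 1})  # two copies of R3, one copy of R4
T = Counter({"R3": 1})           # one copy of R3

union = S + T       # a copy for every event in S and for every event in T
difference = S - T  # like S, but with one copy of every event in T removed

print(union["R3"], difference["R3"], difference["R4"])  # 3 1 1
```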

Given our concern that the correct number of copies of each event should
appear in a state, tokens clearly present a problem. An incorrect possibility for
axiom B3 would be to adapt axiom A3 to the state model, yielding, "If
$\{(r_x\ A), (w_x\ B)\} \subseteq S$, then $S \Rightarrow S \cup \{(AB)\}$." Unfortunately, this could lead to an
unbounded number of copies of the event (AB) being added to S, where we
intended that only one should be added. The situation is further complicated by
the fact that if S contained, say, one copy of $(r_x\ A)$ but two copies of $(w_x\ B)$,
then we would want exactly two copies of (AB) to be generated. Our desire to
ensure that exactly one event (AB) will be generated for each pair $\{(r_x\ A), (w_x\ B)\}$
that exists in S leads us to tag one of the events in the pair (we have picked
$(w_x\ B)$, though the choice is completely symmetrical) with an additional unique
identifier y, so if in fact there are two copies of $(w_x\ B)$ in S, each will eventually
be assigned a different additional identifier before any other use can be made of
the event. Thus part (a) of axiom B3 removes this possible source of ambiguity by
eliminating the possibility that two distinct events in which $w_x$ is sent some object
could look identical.

Part (b) of axiom B3 contains the mechanism by which the token actually does
its work. The events that come into play here are the write-event $(w_{xy}\ B)$
generated by part (a) from $(w_x\ B)$ and the read-event $(r_x\ A)_{a_1 a_2 \ldots a_n}$. The (initially
empty) list of subscripts $a_1 a_2 \ldots a_n$ is the list of additional identifiers y of the
write-events with which the read-event has already interacted. This list is kept to
ensure that no read-event ever interacts with the same write-event more than
once. Since the identifiers appearing in this list are unique among all write-events,
the proper semantics will be maintained for multiple copies of the same object sent
to the same token. In particular, if some state contained two copies of the event
$(w_x\ B)$, some subsequent state would instead contain the two distinct events
$(w_{xy}\ B)$ and $(w_{xz}\ B)$, and each would thus have one chance to interact with each
read-event $(r_x\ A)$. Consequently, two copies of the event (AB) would be generated
from each such read-event, as desired. When a read-event $(r_x\ A)_{a_1 a_2 \ldots a_n}$
interacts with a write-event $(w_{xy}\ B)$, not only is the consequent (AB) generated,
but the read-event is replaced in the subsequent state by $(r_x\ A)_{a_1 a_2 \ldots a_n y}$, that is,
a copy of the read-event with y appended to its list of subscripts, preventing that
read-event from ever interacting with the same write-event again.
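
The bookkeeping performed by the two parts of axiom B3 can be traced mechanically. The Python sketch below is an illustrative model under stated assumptions: a state is a `Counter` of tuples, untagged write-events are `("w", x, B)`, tagged write-events are `("wt", x, y, B)`, read-events are `("r", x, A, subscripts)`, and consequents (AB) are `("app", A, B)`. These encodings, and the helper names `fresh`, `remove_one`, and `step`, are hypothetical.

```python
from collections import Counter
from itertools import count

fresh = count(1)  # source of new identifiers y (an assumption of this sketch)


def remove_one(state, event):
    """Remove a single copy of event from the state."""
    state[event] -= 1
    if state[event] == 0:
        del state[event]


def step(state):
    """Apply one instance of axiom B3; return False if none applies."""
    for event in list(state):
        if event[0] == "w":  # part (a): tag one copy of the write-event
            _, x, b = event
            remove_one(state, event)
            state[("wt", x, next(fresh), b)] += 1
            return True
    for read in list(state):
        if read[0] != "r":
            continue
        _, x, a, seen = read
        for write in list(state):
            if write[0] == "wt" and write[1] == x and write[2] not in seen:
                # part (b): generate (AB) and extend the subscript list
                remove_one(state, read)
                state[("r", x, a, seen + (write[2],))] += 1
                state[("app", a, write[3])] += 1
                return True
    return False


# One copy of the read-event (r_x A) and two copies of the write-event (w_x B).
state = Counter({("r", "x", "A", ()): 1, ("w", "x", "B"): 2})
while step(state):
    pass
print(state[("app", "A", "B")])  # 2
```

The two copies of the write-event receive distinct tags, so the single read-event interacts with each exactly once and the loop terminates.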

Axiom B4 is much more straightforward: it simply provides for the replacement
of the token creation event $(\tau C)$ by its intended consequent $(Cr_xw_x)$, where x is
the unique identifier to be assigned to the token.

Axioms B5, B6, and B7 deal with cells, and use a special convention to record
the current contents of a cell in the state. The presence of the primed "event"
$(u_x\ A)'$ in a state S indicates that the value of cell x in state S is A. The axioms
are written so that for every cell known in S there is at all times exactly one such
value-event. In light of this convention, the cell axioms are seen to be quite
simple. An update event $(u_x\ BC)$ simply replaces the old value-event for the cell with
the new value-event $(u_x\ B)'$ and generates the appropriate consequent (C) as an
indication that the update has been performed. A contents event $(c_x\ C)$ causes
the current value of cell x to be sent to C. Finally, the cell-creation event (#VC)
causes the initial value-event $(u_x\ V)'$ along with the consequent $(Cc_xu_x)$ containing
the names of the two ports of the cell.
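
The cell axioms, together with the nondeterminism that cells introduce, can be explored by enumerating state transitions. The Python sketch below makes several assumptions: a state is a tuple of events, the value-event is encoded as `("val", x, A)`, an update event as `("upd", x, B, C)`, a contents event as `("get", x, C)`, and consequents as inert `("out", ...)` tuples. These encodings and the function names are hypothetical, and only axioms B5 and B6 are modelled.

```python
def successors(state):
    """All states caused by one use of axiom B5 or B6."""
    result = []
    for val in [e for e in state if e[0] == "val"]:
        _, x, a = val
        for e in state:
            if e[0] == "upd" and e[1] == x:    # axiom B5 (updating cells)
                _, _, b, c = e
                rest = list(state)
                rest.remove(val)
                rest.remove(e)
                result.append(rest + [("val", x, b), ("out", c)])
            elif e[0] == "get" and e[1] == x:  # axiom B6 (reading cells)
                _, _, c = e
                rest = list(state)
                rest.remove(e)
                result.append(rest + [("out", c, a)])
    return [tuple(sorted(s, key=repr)) for s in result]


def final_states(state):
    """Every reachable state to which neither axiom applies."""
    nexts = successors(state)
    if not nexts:
        return {state}
    finals = set()
    for s in nexts:
        finals |= final_states(s)
    return finals


# Cell x holds 0; one event updates it to 1, another reads it into R.
start = (("get", "x", "R"), ("upd", "x", 1, "C"), ("val", "x", 0))
seen_by_R = {e[2] for s in final_states(start) for e in s
             if e[0] == "out" and e[1] == "R"}
print(sorted(seen_by_R))  # [0, 1]
```

The two final states are mutually inconsistent routes of the whole computation: in one R receives 0, in the other R receives 1, and no single execution produces both.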

2.5.4: Congruence of States

Just as the notion of congruence (as defined in definition 2.3) was useful for
determining whether or not two superficially different objects or events were in
fact functionally distinct, it will be convenient to define a congruence relation on
states. Obviously it is convenient to call two states S and T congruent if they are
collections of congruent events: in other words, if there is a one-to-one
correspondence between events in S and events in T which matches every event in S to a
congruent event in T (where congruence is as defined in definition 2.3).

There is another kind of congruence between states which is more subtle,
however. This kind of congruence has to do with the unique identifiers used for tokens
and cells in this scheme. The only item of significance about these identifiers is
that every instance of the same cell (or token) bears the same identifier. The text
of the identifier carries no information in itself; it only acquires meaning from the
relationships between the various contexts in which it appears. Thus these
identifiers are not all that different from bound variables of actors, except that
there is no one place (corresponding to the bound variable list of an actor) where
they are explicitly "bound." This observation motivates us to consider another kind
of congruence, one in which two states S and T differ only in the choice of unique
identifiers for cells and tokens. Thus we state

Definition 2.9:

Two states S and T are congruent if there exist functions f and g with the
following properties:

1) f is a one-to-one function from unique identifiers appearing in S to unique
identifiers appearing in T.

2) g is a one-to-one correspondence between events in S and events in T with
the property that each event in S corresponds to an event in T which is
congruent (in the sense of definition 2.3) except that any unique cell or
token identifiers occurring in the former event have been replaced in the
latter event by their images under f.
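
For small states, the definition can be checked by brute force. The Python sketch below simplifies matters in ways that should be kept in mind: events are flat tuples of strings, unique identifiers are marked (by assumption) with a leading "#", congruence of individual events in the sense of definition 2.3 is approximated by equality, and the correspondence g is left implicit in a multiset comparison. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations


def identifiers(state):
    """Unique identifiers in a state; by assumption they start with '#'."""
    return sorted({part for event in state for part in event
                   if part.startswith("#")})


def rename(event, f):
    """Replace every identifier in the event by its image under f."""
    return tuple(f.get(part, part) for part in event)


def congruent(s, t):
    """Search for a one-to-one renaming f under which s and t coincide."""
    s_ids, t_ids = identifiers(s), identifiers(t)
    if len(s) != len(t) or len(s_ids) != len(t_ids):
        return False
    target = Counter(t)
    for image in permutations(t_ids):
        f = dict(zip(s_ids, image))
        if Counter(rename(e, f) for e in s) == target:
            return True
    return False


S = [("r", "#1", "A"), ("w", "#1", "B"), ("val", "#2", "0")]
T = [("w", "#7", "B"), ("val", "#9", "0"), ("r", "#7", "A")]
U = [("r", "#7", "A"), ("w", "#9", "B"), ("val", "#9", "0")]
print(congruent(S, T), congruent(S, U))  # True False
```

S and U are not congruent because in S the read and write events share one identifier, a relationship that no renaming can reproduce in U.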


2.5.5: Conclusions Regarding the State Model

This section has presented a new approach to the semantics of the mu
calculus, based on the concept of the global state of the system rather than strictly
on the consequences of individual events. This change in viewpoint was stimulated
by the difficulty of handling cells using the old model. This, in turn, seems to be
because cells in fact contain global information, which can be used as a medium
for communication between two computations, however distant from each other,
which share the same cell. The necessity of taking this view already portends
some trouble in implementing a fundamentally centralized concept such as a cell on
a distributed system. This will be dealt with in more detail in the next chapter. A
subsequent section of this chapter, however, will be devoted to arguing that there
are better ways of writing many programs which we currently write using side
effects, especially if these programs are to be well suited for a distributed system.
Further discussion of these considerations is postponed until that section.

Finally, the state model presented here violates several desiderata that have
been laid down for semantic bases for concurrent computation, as by Hewitt and
Baker[16]. Sound arguments are made there that a theory of computation should
be based on a local, rather than a global, view. In fact, Hewitt uses the analogy to
relativity theory to argue that a global state is a figment, an unobservable and
useless concept. Along these lines, an interesting property of our state model should
be noted. If a state S can be partitioned into two states S_1 and S_2, and if
S_1 ==>* T_1 and S_2 ==>* T_2, then S ==>* T, where T = T_1 ∪ T_2 (assuming no conflict
between unique identifier generators used in processing S_1 and S_2). The converse
is of course not true, but this property shows a way of decomposing a global
description of a computation into successively more local views. In fact, this is the
theoretical basis for our subdividing the state S among various processors and
running the computations more or less independently. Of course, an event in S_2
may occasionally need to interact with an event in S_1 in order to produce some
result, but this is just another way of saying that the processors will have to
communicate. Every now and then, events may logically need to be transferred from
one processor's responsibility to another's in order for the system to continue
operating.

Thus although the global state S may indeed be a figment, the local states S_1
and S_2 may not. What we then know is that the hypothetical state S always
behaves as if it were composed of some interleaving of the changes to S_1 and S_2.
Many concerns in program specification and verification, though, can be addressed
by considering only a small substate of S directly related to the problem at hand.
If this semantic model is pursued, more work may have to be done on the
significance of transferring events from one of these local spheres to another, but
in any case the model does seem suited to studying the local behavior of small
subcomputations as well as examining the global behavior of entire systems.

2.6: General Synchronization Operators

A weakness that may be felt in some applications remains in the mu calculus
even after adding all the features described in the previous sections. This weakness
is the inability to specify in a general way any kind of synchronization operation
involving mutual exclusion. It is not the purpose of this section to deal in any
deep way with the voluminous literature that has been devoted to synchronization
problems, nor to discuss all the solutions that have been proposed. Rather, an
attempt will be made to show, by example, that solutions to such problems can fit
naturally into the state semantics described in the previous section. Thus it is
entirely reasonable to imagine writing, for example, a mu-expression that will serve
as an arbiter, imposing an ordering on otherwise uncoordinated requests to access
a resource. Of the many synchronization primitives that have been suggested by
various authors, we pick Dijkstra's semaphores[7] as being a serviceable and familiar
set of primitives for our demonstration.

2.6.1:  Semaphores 

We can imagine a semaphore as a pair (p_x, v_x) of related objects, just as
tokens and cells were treated. An event of the form (p_x C) performs a P operation
(a request to proceed) on the semaphore and, when the conditions are right,
causes the event (C). An event of the form (v_x) will result in a V operation being
performed on the semaphore. To create new semaphores, we can use the special
actor Σ; an event such as (ΣC) will result in the creation of a new semaphore
(p_x, v_x) and cause the event (C p_x v_x). The semaphore will be initialized so
that the number of P operations completed can never exceed by more than one the
number of V operations completed. More formally, we can state the following
state-model axioms for semaphores.

Axiom B8 (P operation):

If E = {(v_x), (p_x C)} ⊆ S, then S ==> (S - E) ∪ {(C)}.

Axiom B9 (creation of semaphores):

If E = (ΣC) ∈ S, then S ==> (S - {E}) ∪ {(C p_x v_x), (v_x)}, where x is a new identifier
that does not appear in any event in S.




Note that no explicit axiom is needed for the V operation.

The operation of these two new axioms is fairly simple. The presence of an
event (v_x) in a state S signifies that a P operation in that state can succeed.
Thus the value of the semaphore (p_x, v_x) in a state S is simply the number of
events (v_x) in S minus the number of events in S of the form (p_x C). When a P
operation occurs, it annihilates an event (v_x), preventing other P operations from
completing (unless other events (v_x) remain). When mutual exclusion is no longer
necessary, a V operation will restore the event (v_x) so that another P operation
may proceed.

In order to meet the specifications described informally earlier, the creation of
a semaphore must entail the creation of an initial event (v_x) so that one P operation
can complete before any V operation is begun. Axiom B9 takes care of this
requirement.
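
The two axioms can be exercised on a small model. The Python sketch below treats a state as a multiset of events. The tuple encodings ("new", C) for a creation event, ("p", x, C) for a P request, and ("v", x) for a V event are assumptions chosen for the example, as are the function names. A global counter stands in for the source of new identifiers, and continuation labels are assumed never to be the string "p".

```python
from collections import Counter
from itertools import count

_fresh = count(1)  # assumed source of new identifiers x


def create(state, cont):
    """Axiom B9: consume one creation event, add (C p_x v_x) and (v_x)."""
    event = ("new", cont)
    if state[event] == 0:
        raise ValueError("no creation event for this continuation")
    x = next(_fresh)
    nxt = state.copy()
    nxt[event] -= 1
    nxt[(cont, ("p", x), ("v", x))] += 1
    nxt[("v", x)] += 1
    return +nxt, x  # unary plus drops events whose count fell to zero


def p_successors(state):
    """Axiom B8: every state obtainable by pairing a (v_x) with a (p_x C)."""
    result = []
    for event in state:
        if len(event) == 3 and event[0] == "p" and state[("v", event[1])] > 0:
            _, x, cont = event
            nxt = state.copy()
            nxt[event] -= 1
            nxt[("v", x)] -= 1
            nxt[(cont,)] += 1  # the caused event (C)
            result.append(+nxt)
    return result


def value(state, x):
    """Number of (v_x) events minus number of (p_x C) events."""
    waiting = sum(n for e, n in state.items()
                  if len(e) == 3 and e[0] == "p" and e[1] == x)
    return state[("v", x)] - waiting
```

With two pending P requests and one V event, p_successors returns two possible next states, one per request. This is the same nondeterminism the axioms allow.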

2.6.2:  Construction  of  an  Arbiter 

As an example of how semaphores may be used in the mu calculus, we consider
the construction of an arbiter actor α. The behavior we desire is that an
event of the form (αFC) cause an event (CF*), where F* is a version of the actor F
enclosed in an arbiter; in other words, an event (F*xy) will cause an event
(Fxy'), where y' is a continuation derived from y, but only if no other such event
has been caused and not yet resulted in F returning an answer to its continuation.
Thinking of F as a function, we see that access to F will be serialized: no message
to F* will be passed along to F if any invocation of F is currently active. This
kind of behavior would be desirable if F were, for example, a function that
allocated seats on a particular flight in an airline reservations system.

An  arbiter  with  the  desired  properties  can  be  defined  as 




α = μfc.Σμpv.cμxy.pμ.fx(μz.(yz)(v))

As an example, let us imagine applying this actor to some actor F. Our initial state
is then

S_0 = {(αFC)}

This will cause these subsequent states:

S_1 = {(Σμpv.Cμxy.pμ.Fx(μz.(yz)(v)))}

S_2 = {((μpv.Cμxy.pμ.Fx(μz.(yz)(v))) p_1 v_1), (v_1)}

S_3 = {(Cμxy.p_1 μ.Fx(μz.(yz)(v_1))), (v_1)}

    = {(CF*), (v_1)}

Let us suppose that due to machinations inside C, S_3 eventually causes

S_4 = {(F* x_1 y_1), (F* x_2 y_2), (v_1)}

Among the states that may be caused by this is

S_5 = {(p_1 μ.F x_1 (μz.(y_1 z)(v_1))), (p_1 μ.F x_2 (μz.(y_2 z)(v_1))), (v_1)}

which, by axiom B8, can lead to either of

S_6 = {(μ.F x_1 (μz.(y_1 z)(v_1))), (p_1 μ.F x_2 (μz.(y_2 z)(v_1)))}

S_6' = {(p_1 μ.F x_1 (μz.(y_1 z)(v_1))), (μ.F x_2 (μz.(y_2 z)(v_1)))}

Note that, in either of these, the event that still has p_1 as the receiver will remain
untouched in all future states until an event (v_1) is added back. Since the situation
is symmetrical, let us pursue the consequences of S_6.

S_7 = {(F x_1 (μz.(y_1 z)(v_1))), (p_1 μ.F x_2 (μz.(y_2 z)(v_1)))}

S_8 = {((μz.(y_1 z)(v_1)) z_1), (p_1 μ.F x_2 (μz.(y_2 z)(v_1)))}

assuming F returns the result z_1 to its continuation when sent the value x_1.

S_9 = {(y_1 z_1), (v_1), (p_1 μ.F x_2 (μz.(y_2 z)(v_1)))}

Now that the invocation of F resulting from (F* x_1 y_1) is complete, the event (v_1) is
added again so that the pending request can proceed in a similar fashion. At no
time, however, was it possible for both to be active at once.
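
The serialization shown in this trace can be imitated in a conventional language. The Python sketch below is a single-threaded simulation and makes several assumptions: an explicit list of pending events stands in for the state, the class and function names are invented for the example, and blocked requests are released in arrival order. The axioms do not fix any such order.

```python
from collections import deque


class Semaphore:
    """A (p, v) pair in the spirit of axioms B8 and B9."""

    def __init__(self, events):
        self.events = events    # shared list of pending events
        self.v_events = 1       # the initial (v) event supplied at creation
        self.blocked = deque()  # continuations of pending (p C) events

    def p(self, cont):
        self.blocked.append(cont)
        self._match()

    def v(self):
        self.v_events += 1
        self._match()

    def _match(self):
        # One (v) event and one (p C) event annihilate and cause (C).
        while self.v_events and self.blocked:
            self.v_events -= 1
            self.events.append(self.blocked.popleft())


def arbiter(F, events):
    """Return F*, which forwards to F only after a successful P operation."""
    sem = Semaphore(events)

    def F_star(x, y):
        def y_prime(z):
            events.append(lambda: y(z))  # the event (y z)
            sem.v()                      # the event (v)
        sem.p(lambda: F(x, y_prime))
    return F_star


def run(events):
    """Process pending events until none remain."""
    while events:
        events.popleft()()
```

Here F is expected to take an argument and a continuation, and to call the continuation exactly once. This mirrors the requirement on F discussed in the text that follows.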

It is significant that in the use of the arbiter α, the argument F must be not
just an actor but a function, an actor which indicates that it is "done" by following
the convention of returning a single value to its continuation (similar requirements
attend the use of the parallelism actor w discussed earlier). If the actor F never
returns an answer to its continuation, then the semaphore will never be released,
and no other request to F will ever be allowed to proceed. If F returns more than
one answer to its continuation, as in the "function"

F = μxc.(c3)(c4)

then two V operations will be performed on the semaphore, which will thereafter
operate incorrectly (allowing two or more invocations of F to be active at the same
time). The requirement that the object subject to arbitration obey the ground rules
for a function is not unique to this scheme; it applies also, for example, to Hewitt's
serializer construct[14].




2.6.3: Conclusions Regarding Semaphores




After a brief discussion of the problem of mutual exclusion, semaphores were
introduced as a way of implementing mutual exclusion in the mu calculus. The goal
was not to shed any new light on synchronization problems, but simply to demonstrate
the possibility of including mechanisms for dealing in a natural way with
mutual exclusion in the mu calculus. A simple application of semaphores, the
construction of an arbiter to serialize access to an actor, was then explained, to show
how semaphores might be used in the mu calculus. Finally, some aspects of the
arbiter, including restrictions on the class of actors to which it can be applied,
were discussed.

Not treated in this brief overview were several of the thorny synchronization
problems described in the literature. Various problems have been proposed which
cannot easily be solved using semaphores; however, there is no reason to believe
that it would be difficult to include other synchronization primitives in the mu
calculus. Another unexplored dimension concerns fairness of scheduling in
synchronization operators. There are solutions, for example, to the readers-writers
problem[11] which are partially correct: no violation of the desired mutual exclusion can
occur. Nevertheless, it is possible for an unfortunately timed sequence of, say,
read requests to prevent a write request from ever completing, if the semaphores
used are "unfair." A fair semaphore, on the other hand, would honor requests in
the order of their arrival; thus any request is assured of being processed after a
finite length of time.

The natural question to ask is, are the semaphores described by axioms B8
and B9 fair or unfair semaphores? One answer is that it is impossible to tell. The
state model gives, for some initial state, a set of later states that it could cause.
It does not say which of these possibilities would be produced by an actual
implementation. The formulation given for semaphores does not exclude the
possibility of the semaphores being fair: all possible outcomes from a fair semaphore are
represented. However, if we interpret our state model as representing only those
implementations which could conceivably generate any of the consequences
predicted by the state model, then we must answer that the semaphores described
are unfair, since a given initial state may cause some states which could only be
reached in an implementation that had unfair semaphores.

Interestingly enough, there is no simple way to fix this in our model. One could
modify the semaphore axioms so that the "oldest" P request in a state is always
the next one processed, but this would be only a partial solution. That is because
there is no guarantee of fair scheduling in our model, no guarantee that a state
cannot cause an infinite sequence of other states, without some event in the
original state ever being processed. Once again, our model does not exclude fair
scheduling, but it does not specify it, and without a guarantee of fair scheduling a
fair semaphore is meaningless. The conclusion to be drawn from all of this, then, is
that our state model is useful for describing all possible results of a computation,
but less useful for distinguishing the results that some particular strategy, such as
fair scheduling, would allow, from the results possible under some scheme offering
fewer guarantees.

2.7:  Coding  Imperative  Constructs 

So far, our attention has been focused primarily on developing various
mechanisms for the mu calculus and showing their relationship to languages with a mainly
applicative flavor, such as LISP and the lambda calculus. Grudging acknowledgment
has been made in the previous two sections of the existence of side effects and
the desirability of being able to model them, but no discussion of imperative styles
of programming has been conducted. This has been due, first, to the fact that
applicative constructs are usually easier to deal with formally, and second, to a
prejudice that side effects are difficult to handle in distributed systems and
therefore are best avoided. The prejudice remains; this author is convinced that
extensive use of side effects (such as cells and semaphores) will make it much
more difficult for a distributed system of the kind envisioned to process a program
at top efficiency. Since so much of the world programs in a basically imperative
style, however (using languages such as FORTRAN, COBOL, PL/1, ALGOL, PASCAL,
etc.), the goal of this section is to argue that many activities of imperative
programs (such as assignment statements) that are ordinarily thought of as being
accompanied by side effects can be expressed in a form that is quite free of side
effects and thus much more suitable for a distributed system.

The mechanics of translating programs written in an imperative style (including
assignments, go-tos, conditionals, looping, and parallelism) into a continuation- or
message-passing style have been explored in great depth by Sussman and
Steele[2Q,dO,33]. Their translation includes the removal of side effects caused by
most assignment statements and is quite applicable to producing mu-calculus
expressions as output. It would be repetitive and inappropriate to go into such
detail here, but some aspects of our view of parallelism specifically and message
passing generally warrant some further discussion.

The Sussman-Steele scheme works by translating a sequence of statements
into a continuation-style program, where the first statement in a sequence becomes
an event containing the translation of the remaining statements inside a
continuation actor which is part of that first event. Many side effects, such as those
associated with most assignments, can then be eliminated by viewing local variables not
as cells into which different values may be put over time, but as parameters to the
continuation. Thus the translation of statement n receives as parameters the
values of all local variables, plus a continuation actor containing the translations of
all statements following n. If the execution of statement n has no effect on the
value of any local variable, it will eventually pass to the continuation exactly the
set of local variable values it received. If statement n is, for example, an
assignment to local variable x, then no side effect is required. Instead, the set of local
variable values passed to the continuation will have the new value for x in place of
the old one. Sussman and Steele accept that not all assignments can be, or should
be, treated in this fashion: some assignments are to "global" variables, for
example, directories in a file system. This is where the concept of cells should fit in. It
is nevertheless true that the Sussman-Steele procedure can be used to translate
ordinary imperative programs into a form which is largely free of side effects.
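
A small illustration may help. The Python sketch below hand-translates a three-statement imperative fragment with two local variables, x and y, into continuation style. Each statement receives the current values of the locals plus a continuation, and passes on a possibly updated set of values. The fragment and the helper names are invented for this example. It is not the Sussman-Steele translator itself, only the shape of its output under these assumptions.

```python
# Imperative source being translated (hypothetical example):
#     x := 1
#     y := x + 2
#     x := x * y
#     result is x

def stmt1(x, y, k):   # x := 1
    return k(1, y)    # new value for x, y passed through unchanged


def stmt2(x, y, k):   # y := x + 2
    return k(x, x + 2)


def stmt3(x, y, k):   # x := x * y
    return k(x * y, y)


def skip(x, y, k):    # a statement with no effect on any local variable
    return k(x, y)    # passes on exactly the values it received


def bind(stmt, k):
    """Continuation that runs stmt and then everything following it."""
    return lambda x, y: stmt(x, y, k)


def sequence(stmts, final):
    """Nest the statements so that each one's continuation holds the rest."""
    k = final
    for stmt in reversed(stmts):
        k = bind(stmt, k)
    return k


program = sequence([stmt1, stmt2, stmt3], lambda x, y: x)
```

No variable is ever overwritten. Each "assignment" simply supplies a different argument to the next continuation.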

Assuming that sensitivity to side effects has been removed from some section
of a program, it can then be adapted in various ways to be even more suitable for
execution on our kind of distributed system, especially with appropriate choices in
the design of the original imperative language. To echo many other
researchers[2,12], it should be possible to engage in parallel evaluation of subexpressions,
for example, of A and B in the expression "A+B," or even in the case where A and
B are procedure arguments, as in "f(A,B)." Our token mechanism seems a quite
serviceable way of accomplishing this. Another way of increasing parallelism is
through multiple assignment statements such as "a,b,c ← A,B,C" where A, B, and C
should all be able to be evaluated in parallel. This fits in very naturally with the
imperative- to continuation-style translation described above.

A third possible way of increasing the power of an imperative language to
express parallelism might be by a construct such as "S&T," where S and T are
statements. The semantics of this statement would be that S and T may be
executed in parallel, and that the following statement may not begin execution until
both S and T have finished. The "S&T" statement would allow the programmer great
freedom to express a wide variety of permissible orderings of computations; this
flexibility could then be used to maximum advantage by the system. It is not hard
to see how the semantics of this statement could be constructed using tokens,
although problems do arise if both S and T modify local variables and those changes
must be merged at the end. More serious problems arise if S and T use
assignments to local variables for communication; in this instance cells would probably
have to be used to implement those variables.
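
The merging problem can be made concrete with a sketch in Python. Here par plays the role of "S&T". Each branch receives its own copy of the local variables and returns an updated copy, and the two sets of changes are merged once both branches have finished. The function names, the use of a thread pool in place of tokens, and the policy of rejecting conflicting assignments are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def changes(before, after):
    """Variables whose values differ from, or are absent in, the original."""
    return {k: v for k, v in after.items()
            if k not in before or before[k] != v}


def par(S, T, env):
    """Run S and T on private copies of env, then merge their assignments."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        future_s = pool.submit(S, dict(env))
        future_t = pool.submit(T, dict(env))
        # result() blocks, so nothing after par starts until both finish.
        delta_s = changes(env, future_s.result())
        delta_t = changes(env, future_t.result())
    clash = delta_s.keys() & delta_t.keys()
    if clash:
        raise ValueError(f"both branches assigned: {sorted(clash)}")
    return {**env, **delta_s, **delta_t}
```

Because each branch works on a private copy, an assignment in S is invisible to T while both are running. Branches that need to communicate through variables would need some shared mutable object instead, which corresponds to the use of cells suggested above.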

Even more exciting possibilities for generating large amounts of parallelism are
present in the looping constructs of imperative languages. In many cases such
loops are used simply to express the notion that some operation is to be performed
on every element of some data structure. If the order is unimportant, which is
frequently the case, a looping construct allowing all activations of the loop body to
proceed in parallel, rather than sequentially, could result in large amounts of useful
parallelism. Imagine, for example, a compiler simultaneously performing type
checking or even translation of every statement in a block. An organization that allowed
all those activities to proceed in parallel might well be able to make use of several
processors to speed the compilation.
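
As a rough illustration of such an order-independent loop, the Python sketch below applies a loop body to every element of a collection using a pool of workers. The helper parallel_for_each and the toy check_statement body are hypothetical. In standard CPython, threads do not run Python bytecode in parallel, so the example shows only that the iterations are independent, not an actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_for_each(body, items, workers=4):
    """Apply body to every item; activations may overlap in time.

    Results are returned in the order of the input items.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(body, items))


def check_statement(stmt):
    """Toy stand-in for type checking one statement of a block.

    A statement is modeled as (target_type, value_type); it is accepted
    when the two type names agree.
    """
    target_type, value_type = stmt
    return target_type == value_type
```

The loop body must not depend on the results of other activations. This is the condition the text describes as the order being unimportant.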

The discussion in this section is necessarily of a very speculative nature,
since the rigorous detail beneath it, if developed carefully, would expand in size
out of proportion to its place in this thesis. Nevertheless, it is hoped that some
reasons have been presented to be optimistic that the mechanisms for parallelism
that have been developed in this chapter can be used to generate large numbers
of parallel activities performing programming tasks of general interest. In the next
chapter, we will be banking on this property as the key to being able to apply
multiple processors to such tasks. The theory will be that, even if at any time the
majority of parallel activities are blocked, awaiting some information from a distant
site, there will still be several activities at any time that are not blocked, and thus
several processors can be kept busy.


2.8:  Summary 

This chapter has outlined the mu calculus, a formalism for studying
message-passing computation. The development began with the pure mu calculus, which is
very much like a restriction of the lambda calculus, and proceeded through the
addition of tokens, cells, and semaphores. Along the way, correspondences
between the applicative, imperative, and message-passing styles of programming
were discussed. Also explored were applications of the mu calculus to implementing
recursion, list structures, self-referential structures, arbiters, and other
constructs.

In spite of these "applications," the mu calculus is not billed as a programming
language. Others[15,19] have developed programming languages based on message
passing. In contrast, this chapter has sought to develop a formalism, bereft
of all embellishments, for better understanding the essence of message passing.
For this reason, the mu calculus has been made as spare a language as possible.
If it is to be used directly at all, one can only imagine it as the "machine language"
for a distributed system, perhaps the one outlined in the remainder of this thesis.
Actual programming would certainly be done in a highly sugared version of the mu
calculus or in a different kind of language altogether, which would then be
translated to a form resembling the mu calculus. The various translation rules
presented suggest some ways in which this desugaring or translation might be
done.


As an example of "the essence of message passing," the mu calculus invites
controversy. Several aspects of message passing that are manifest in, say,
Hewitt's PLASMA[15], are absent from the mu calculus, even disregarding the
extensive syntactic sugaring that goes into making PLASMA a usable language. For
example, PLASMA incorporates the notion of a process, whereas the mu calculus
does not. The advantages of recognizing the existence of processes include the
ability to treat a process as an object, and consequently to perform operations on
it. In a practical system, this provides a framework for keeping various useful
pieces of information, such as a process's user ID, priority, or whatever. It also
permits a convenient mechanism for identifying and killing runaway computations.
On the other hand, the concept of processes is one that, as we have seen in this
chapter, we can do without, simplifying our formalism. Whether the simplification
removes something essential from the formalism, as far as its usefulness in
modeling real situations is concerned, remains an open question.

Also left out of the mu calculus is another part of the actor ideology. As
articulated by Hewitt, this ideology holds that all transactions, including message
transmissions, are mediated by actors. Thus the causation of an event entails the
arrival of a messenger actor in the vicinity of a target actor. This messenger actor
is a package containing the objects actually being sent to the target. The target
may then extract these objects by sending messages to the messenger, which may
in turn send messages to these messages, and so on. (It should be pointed out
that Hewitt makes good use of the packaging of messages into messengers to
implement forms of keyword-based, as well as position-based, conventions for
supplying arguments to actors.) It is recognized that at some point this potentially
infinite regress must be terminated, so it is stipulated that at some point a
message may be examined by accessing its innards directly, rather than by
sending it messages and awaiting its replies. The mu calculus eliminates this regress
by explicitly providing a basis for it. In the mu calculus, actor arguments are not
packaged into messenger actors, but instead are immediately available. The user
may, of course, build on top of this as many layers of message-transmission
protocol as desired.

This discussion of messengers is related to another important part of the actor
ideology. This is the dictum that, as far as is practicable, objects in an actor
system should interact with each other by exchanging messages, rather than directly
accessing each other's innards. Thus, for example, an actor which computes the
negative of its argument should send a message to its argument asking it what its
negative is, rather than trying to examine the argument and compute the negative
that way. Adherence to this dictum can increase the independence of operators
from the representations used for data, and also facilitate the extension of
operators to new data types. Although the arithmetic operators of the mu calculus have
not been specified to work this way, this has been because it was desired to
maintain some similarity to the lambda calculus and, once again, to provide a basis:
somewhere there has to be some object which knows how to perform basic
arithmetic operations directly. The design of the mu-calculus primitives was certainly
not an attempt to take a position on this aspect of the actors approach, which has
much to recommend it. As before, the user is free to add levels of protocol on top
of the bare mu calculus to achieve this end. Exploration of these issues is not a
major goal of this thesis, however.

The mu calculus is thus designed to serve two purposes. One is to gain a
better understanding of the essence of the message-passing style of computation
by excising all that seems inessential. The other is to serve as a semantic basis




Chapter  3:  Implementations 


The goal of this research has been not only to specify a language (the mu
calculus) with the potential to make good use of a distributed system, but to
demonstrate that potential by describing an appropriate implementation. This chapter will
therefore be devoted to outlining an implementation of the mu calculus.

3.1:  The  Basic  Approach 
3.1.1:  Semantic  Structure 

The implementation we choose will be based on the concept that there exist a
variety of objects scattered around a multiple-processor system. These objects
may move from one site to another as the needs of the system require, and, under
suitable circumstances, multiple copies of an object may be maintained. Objects
are not monolithic, but contain structure, depending on the type of data the object
is supposed to represent. Among the kinds of components an object may have are
references to other objects; thus the system will have the capability to support
tree-like or even cyclical organizations of data. The semantic properties of
object-reference systems such as this have been studied at some length[4] along with
possible implementations on centralized computer systems.

In our case, objects might be actors, cells, tokens, numbers, or other kinds,
perhaps specialized to a particular application (a graphics system, for example,
might include three-dimensional vectors and 3x3 matrices as primitive object types,
along with the applicable primitive operators). Since many of these kinds of
objects (e.g., actors) may have a great deal of structure, the ability of objects to
naturally reference other objects is crucial.

In addition to objects, our system must have some way of representing events.
Events, however, are never more than a transitory presence in our system; they
serve only to mark the current place in a computation until that computation
advances one step, leading to the creation of new events and the demise of the
old one. Thus it is profitable to treat events somewhat specially.

It is useful to think of a system-wide event list containing all events in the
system that have been created but not yet processed. This global event list
corresponds closely to the global system state described in the previous chapter.
In practice, this system-wide event list will be distributed (partitioned) into several
smaller event lists, one maintained by each processor in the system.
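
A minimal Python sketch of this arrangement follows. Each processor keeps its own event list, and the system-wide list exists only as the union of the local ones. The class and function names, the round-robin stepping, and the convention that processing an event returns a list of (destination processor, new event) pairs are all assumptions made for the example. They are not a description of the implementation developed in this chapter.

```python
from collections import deque


class Processor:
    """One site in the system, holding its share of the event list."""

    def __init__(self, name):
        self.name = name
        self.events = deque()


def global_event_list(processors):
    """The conceptual system-wide list: all events not yet processed."""
    return [event for p in processors for event in p.events]


def run(processors):
    """Step the processors round-robin until no events remain anywhere.

    Processing an event removes it and may create new events, each
    addressed to some processor by index. Returns the number processed.
    """
    processed = 0
    while any(p.events for p in processors):
        for p in processors:
            if p.events:
                event = p.events.popleft()
                for dest, new_event in event():
                    processors[dest].events.append(new_event)
                processed += 1
    return processed
```

An event here is just a callable, which reflects the point that an event marks a place in a computation only until it is processed.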

In a more traditional system, the function served by the event list might be
served instead by a table of active processes. Typically, each entry in such a
table would contain various pieces of state information about the process, such as
saved register contents, stack pointer, process stack, location of the core image
for the process, etc. In the message-passing system described here, all this
information is contained in the selection of objects referenced by an event. There are
no processes per se, just events waiting to be acted upon.
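
As an illustration only, the relationship between events, per-processor event
lists, and the conceptual system-wide event list might be sketched as follows.
The class and field names here are assumptions made for the sketch, and object
references are stood in for by plain strings.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Event:
    # An event carries only references to objects; the first reference
    # names the receiving object.
    refs: tuple


@dataclass
class Processor:
    pid: int
    event_list: deque = field(default_factory=deque)  # local event list


def global_event_list(processors):
    """Conceptual system-wide event list: the union of all local lists."""
    return [ev for p in processors for ev in p.event_list]


p0, p1 = Processor(0), Processor(1)
p0.event_list.append(Event(refs=("actor#1", "num:3", "cont#7")))
p1.event_list.append(Event(refs=("actor#2", "cont#9")))
pending = global_event_list([p0, p1])
```

Note that nothing in an `Event` plays the role of saved registers or a stack;
whatever state the computation needs is reachable through the references.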

3.1.2:  Physical  Structure 

Any student of distributed systems quickly comes face to face with a whole
variety of different possible network structures: rings, Ethernets, shared buses,
hierarchical organizations, store-and-forward networks, etc. Many of the properties
of these are similar, but there are differences in capabilities relating to the ease of
knowing whether a message has been received, bandwidth, response to contention
for access to the network, routing of messages, and other matters. As much as
possible, it is the intent of this research to avoid becoming committed to the
narrow technological characteristics of any one type of network. However, for
concreteness, it will be helpful to make certain assumptions. Furthermore, the
algorithms presented depend on a certain logical organization of processors which may
be more easily achievable with some physical organizations than with others. This
section seeks to outline a physical organization which is very compatible with the
required logical organization, and make some preliminary comments on which
aspects of its design are essential for the proper functioning of the algorithms to
be presented below. Additional comments on this topic will appear throughout this
chapter as the relevant concepts are discussed.

The basic logical structure assumed by our implementation will be a collection
of processors, each with a private local memory (i.e., no sharing of memory
between processors is required), each connected to a small number of other
processors which are its neighbors. The connections are bidirectional and
symmetrical; thus if A is a neighbor of B, then B is a neighbor of A. Such an organization of
processors is thus equivalent to an undirected graph (which may or may not contain
cycles) in which only a few arcs emanate from each node. "Few" here is a
relative term; its use is due to the fact that the overhead incurred by a processor will
increase if that processor accumulates more neighbors. Thus it might well be
appropriate for higher-capacity machines or machines with more of a commitment to
serve the network (rather than, say, their owners) to have a greater number of
neighbors. A variety of topologies are consistent with these general restrictions
(some are shown in Figure 3.1).

Figure 3.1: Some possible network topologies

In addition to the specifications given above, each processor will be required
to have a processor ID unique to the entire system (or some other way of
generating names guaranteed to be unique throughout the system). It is not necessary,
however, that all processors be identical, or that every processor use the same
external data representation in communicating with its neighbors. In general, a
different representation could be used for each link, subject to a few provisos.
First, obviously the processors at each end of the link must understand the
representation used on that link. Second, the representation must be sufficiently
general that any kind of object that may exist in the system can be passed over
the link. Third, the link must support transmission of the system-wide unique names
of objects in such a fashion that the name of an object will always be recognized,
even if it arrives at a processor via two different routes. Fourth, although in theory
the protocols used on different links may differ, they are all likely to have the
common features described in this chapter.

The physical topology of the system need not follow the logical topology
described above. The logical topology in effect constrains the paths over which
information may travel; any physical topology which permits information flow over
these channels may form the basis for an acceptable implementation. For example,
an Ethernet (on which every processor can communicate directly with every other
processor) could be made to support the logical topology described above either by
declaring every processor to be a neighbor of every other (which might however
impose a large amount of overhead on each), or by choosing for each processor a
set of logical neighbors, and constraining the communication patterns on the net so
that no processor ever sends a message to another processor that is not its
neighbor. This approach makes less than full use of the capabilities of the Ethernet,
since a processor might occasionally be forced to follow a rather tortuous path to
communicate with another on the same net, but might reduce the overhead within
the processors for object management, as we shall see.
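
The logical organization described above, and the cost of restricting
communication to logical neighbors, can be made concrete with a small sketch.
The `Network` class, its method names, and the six-processor ring are all
assumptions chosen for illustration; nothing here is prescribed by the scheme
itself.

```python
from collections import deque


class Network:
    def __init__(self):
        self.neighbors = {}            # processor ID -> set of neighbor IDs

    def add_processor(self, pid):
        self.neighbors.setdefault(pid, set())

    def connect(self, a, b):
        # Links are bidirectional and symmetrical: each end records the other.
        self.add_processor(a)
        self.add_processor(b)
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def hops(self, src, dst):
        """Fewest links a message must cross from src to dst (None if unreachable)."""
        seen, frontier = {src}, deque([(src, 0)])
        while frontier:
            node, dist = frontier.popleft()
            if node == dst:
                return dist
            for nxt in self.neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, dist + 1))
        return None


ring = Network()
for i in range(6):
    ring.connect(i, (i + 1) % 6)      # six processors in a logical ring
```

With this logical ring, each processor has only two neighbors to keep track
of, but `ring.hops(0, 3)` is three links even if all six machines happen to sit
on one physical Ethernet.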

In any case, the scheme presented here was certainly not designed for
Ethernets, but rather for physical networks with properties closely matched to the
logical network. Such networks have several advantages:

■ expansibility. The network can be expanded to a very large size at little
marginal cost. The space allocated for unique processor ID’s does grow, but
only logarithmically. Otherwise, expansion presumably involves simply
hooking up new processors to the edges of the network, and has only a very
local impact.

■ bandwidth. For many topologies, there are potentially a large number of
communication paths between any two points in the network; no central
Ether or other medium serves as a bottleneck.

■ reliability. Even a catastrophic hardware failure at some node is likely to
affect only a limited number of other nodes (its neighbors). In systems with
a central medium, there are components whose failure will stop all
communication on the network. Of course, reliability is also strongly influenced by
the software system’s ability to carry on in the face of failures; admittedly,
this thesis does not address this problem very thoroughly.

■ flexibility. Many different kinds of processor and link technology can in
principle coexist in the system, allowing considerable freedom in picking the
lowest-cost option for the performance desired at each point.

■ other  advantages  that  will  become  evident  when  the  object-management 
algorithms  are  discussed. 

There are of course disadvantages to this kind of organization also. Chief
among these is the need for extra processors to become involved in transactions
between processors which are not neighbors, with the attendant overhead and
delay. It is hoped that our scheme will tend to minimize the need for this kind of
transaction, but it remains to be seen whether the advantages cited above are
worth this drawback.


3.2:  Overview  of  System  Operation 
3.2.1:  Objects  and  Object  References 

As was mentioned in the previous section, we desire to allow objects to refer
to other objects; by using such a reference, an object may logically "contain"
another object without physically containing it. These references will also allow
sharing of objects, as well as increasing the structural modularity of the data, as
has already been discussed. In order to implement this, we introduce the concept
of an object reference[4], which is an entity that uniquely identifies a particular
object without necessarily containing explicitly all the information in the object. In
machine-language programming, a pointer can be such a reference: it serves to
uniquely identify the area in storage containing the desired information, yet
inspection of the bits of the pointer will not yield that information. Although it is not
necessary, we may imagine that all object references have the same length; thus
a reference to any object will in principle fit into the place of a reference to any
other object.

The first tradeoff encountered in designing a system using object references
involves deciding what to include in the reference. At one extreme, an object
reference might be just a unique identifier for an object (like a pointer to a memory
address). At the other extreme, large amounts of information about the object
might be included (type code, length, hash code, etc.; naturally, such information
can only be kept in a reference if it remains valid through all operations that may
be performed on the object). The purpose of including this additional information
would be to enable some operations involving the object (such as determining its
type) to be performed using only the reference, without needing access to the full
text of the object. However, the inclusion of extra information increases the
amount of space required for an object reference. In this space-time tradeoff, the
space required for the extra information in the reference will be more worthwhile if
the amount of time required to access the object from a reference is longer. In a
distributed system, where the object may even need to be fetched from another
site, it seems well worthwhile to include extra information in object references.

In any case, we see that an object in general will have two components: a
reference (which if the object is shared may exist in several places) and a text
which contains all information about the object not contained in the reference
(simple objects, such as numbers, may not have any text, if all information associated
with the object can fit into the reference). There is conceptually only one copy of
the text, shared by all possessors of a reference to an object. Obviously, if side
effects can be performed on an object (such an object can be termed
mutable[20]), all such effects must be accomplished by modifying the text, which is
the shared portion of the object. If an object cannot be changed (an immutable
object), then many copies of its text may be made, so that a copy may be kept
near each site where it might be needed. Even a mutable object may have multiple
copies of its text made under appropriate circumstances, a topic to which we shall
return.

For  the  sake  of  concreteness,  we  may  imagine  an  object  reference  to  contain 
the  following  fields: 

■ a unique identifier. This field is determined when the object is created and
is different for each distinct object in the entire system. It might, for
example, be constructed by concatenating the unique processor ID of the
processor where the object was created and the count of the number of objects
created on that processor up to that time. It is important to note, for
reasons that will be explained later, that this implies no special responsibility of
this processor for the object once it has been created.


■ a type code. This field distinguishes between actors, strings, numbers, and
other primitive types in the system.

■ a length. This is primarily useful for planning message transfers involving
the text of the object, but could also be used as a first cut in, say,
checking the equality of two strings.

An object reference containing the above information could fit comfortably into 64
bits, assuming a modest-size system. This allows for about forty bits of unique
identifier. A larger system might need more space for unique identifiers, but it is
hard to imagine that, say, 100 bits would not suffice for this purpose.
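
To make the 64-bit figure concrete, the following sketch packs the three
fields into one word. Only the 64-bit total and the roughly forty-bit unique
identifier come from the discussion above; the 8-bit type code, the 16-bit
length, and the 16/24 split of the identifier between processor ID and object
count are assumptions chosen purely for illustration.

```python
ID_BITS, TYPE_BITS, LEN_BITS = 40, 8, 16      # 40 + 8 + 16 = 64 (assumed split)
PROC_BITS, COUNT_BITS = 16, 24                # assumed layout of the 40-bit ID


def make_unique_id(proc_id, count):
    """Concatenate the creating processor's ID with its object count."""
    if not (0 <= proc_id < 1 << PROC_BITS and 0 <= count < 1 << COUNT_BITS):
        raise ValueError("field out of range")
    return (proc_id << COUNT_BITS) | count


def pack_ref(uid, type_code, length):
    """Pack identifier, type code, and length into one 64-bit integer."""
    if not (0 <= uid < 1 << ID_BITS and 0 <= type_code < 1 << TYPE_BITS
            and 0 <= length < 1 << LEN_BITS):
        raise ValueError("field out of range")
    return (uid << (TYPE_BITS + LEN_BITS)) | (type_code << LEN_BITS) | length


def unpack_ref(word):
    """Recover (uid, type_code, length) from a packed reference."""
    length = word & ((1 << LEN_BITS) - 1)
    type_code = (word >> LEN_BITS) & ((1 << TYPE_BITS) - 1)
    uid = word >> (TYPE_BITS + LEN_BITS)
    return uid, type_code, length


uid = make_unique_id(proc_id=3, count=17)
word = pack_ref(uid, type_code=2, length=100)
```

The type and length can then be read from `word` alone, without fetching the
text of the object.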

The object reference format just described is one that would be suitable for
communication between machines. Inside each processor, various tricks obviously
can be used to reduce the amount of storage space required (and also facilitate
access to other information about the object). It is convenient to convert incoming
object references at a site by looking them up and/or entering them in a directory
of objects known at that site. Once this is done, the incoming reference may be
replaced by a much shorter (e.g., 8-16 bit) reference which is unique within the
processor, and which identifies the directory entry containing the full information
about the object. Incoming object texts may thus be converted to the much
shorter internal form, and the full inter-processor form regenerated from the internal
form when a message must be sent. Obviously, the details of this translation can
vary between sites, depending on various attributes of the host machines
(processor architecture, memory size, word size), as long as some standard protocol for
communicating with neighbors can be maintained.
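
One way such a directory might work is sketched below. The `Directory` class,
its method names, and the 16-bit capacity limit are assumptions for
illustration; an actual site could organize its directory quite differently.

```python
class Directory:
    """Per-processor table mapping full references to short internal ones."""

    MAX_ENTRIES = 1 << 16              # assumed limit: 16-bit internal references

    def __init__(self):
        self._index = {}               # full reference -> internal reference
        self._entries = []             # internal reference -> full reference

    def intern(self, full_ref):
        """Look up an incoming reference, entering it if it is new."""
        short = self._index.get(full_ref)
        if short is None:
            if len(self._entries) >= self.MAX_ENTRIES:
                raise OverflowError("directory full")
            short = len(self._entries)
            self._entries.append(full_ref)
            self._index[full_ref] = short
        return short

    def externalize(self, short):
        """Regenerate the full inter-processor form for an outgoing message."""
        return self._entries[short]


d = Directory()
a = d.intern(0x0000_0300_0011_0264)    # first sighting: entered in the directory
b = d.intern(0x0000_0400_0002_0110)
again = d.intern(0x0000_0300_0011_0264)  # second sighting: same internal reference
```

Because the same full reference always maps to the same internal reference,
an object is recognized even if its name arrives via two different routes.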

Similarly, details of the internal format of object texts may vary from one
processor to another, but one or more external standards will have to exist for
communication between processors. For concreteness, we describe briefly a set of
formats used by computer simulations written as part of this research.

For numbers, there is no text. Instead, a short form of object reference is
used, containing the bits of the number and a tag identifying the object as a
number. ASCII strings are also supported. Here, the text contains the bytes of
the string, followed by a zero byte (but no references to any other objects).

Mu-expressions are represented using a complicated format which indicates
the number and identities of the free and bound variables of the mu-expression,
followed by the number of events in the body and references to the components of
those events. Since these components are other objects, the text of a
mu-expression may actually contain references to other objects (for example, other
mu-expressions).

A reasonable mu-calculus interpreter is likely to work by creating closures
rather than by substitution, so an object type for the closure of a mu-expression
(let us call this an "actor") is needed. The text of an actor is just a sequence of
object references. The first is a reference to the mu-expression being closed, the
remainder a list of references to the values of free variables in the mu-expression.
The order of these values in this list is determined by the representation of the
mu-expression.
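
The layout of an actor text can be sketched as follows. The `MuExpression`
record holds only a tuple of free-variable names, which is an assumption made
for the sketch (the actual mu-expression format is more complicated), and
references are again stood in for by strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MuExpression:
    free_vars: tuple       # names of free variables, in representation order


def make_actor_text(mu_ref, mu, bindings):
    """Build an actor text: the mu-expression reference, then free-variable values."""
    return (mu_ref,) + tuple(bindings[name] for name in mu.free_vars)


def free_value(actor_text, mu, name):
    """Look up the reference bound to a free variable of the closed mu-expression."""
    return actor_text[1 + mu.free_vars.index(name)]


mu = MuExpression(free_vars=("x", "k"))
text = make_actor_text("mu#12", mu, {"k": "cont#7", "x": "num:3"})
```

Since the order of values is fixed by the mu-expression, the actor text need
not carry variable names at all.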

The representation of tokens involves three kinds of objects: read sides,
write sides, and token bodies. The text of a read or write side simply contains a
reference to the corresponding token body. The text of the body is organized as
a table somewhat like Figure 2.7, containing information about objects sent to the
read and write sides. Since the text of a token body can change, multiple copies
cannot be made, as they can for all the kinds of objects discussed up to now. However,
the text of a token body can be partitioned among several sites, in such a way
that any individual entry appears at only one site. Adding a new entry then
requires notification of all sites containing any part of the text so that the proper
interactions can occur; a mechanism by which this can be accomplished will be
described later.

The representation of cells is similar to that of tokens in that there are three
kinds of objects involved: update ports, contents ports, and cell bodies. As in the
case of tokens, the text of an update or contents port simply contains a reference
to the relevant cell body. A simple scheme for representing cell bodies is one in
which the text of a cell body is just a reference to the object which is the current
contents of that cell. Since this text is, of course, mutable, care must be taken
that there be only one copy of it at any time. Thus if activities on several
different processors are all accessing the same cell body, a certain amount of
communication overhead will be incurred. Ways of ameliorating this situation will be
discussed later.
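
A single-processor sketch of this simple cell scheme follows. The class and
function names are assumptions for illustration, and ordinary Python object
references stand in for system object references, so none of the
communication overhead mentioned above is modeled.

```python
from dataclasses import dataclass


@dataclass
class CellBody:
    contents: object       # reference to the current contents (the mutable text)


@dataclass(frozen=True)
class UpdatePort:
    body: CellBody         # the port's text is just a reference to the body


@dataclass(frozen=True)
class ContentsPort:
    body: CellBody


def make_cell(initial_ref):
    """Create one cell body and the two ports that refer to it."""
    body = CellBody(initial_ref)
    return UpdatePort(body), ContentsPort(body)


def update(port, new_ref):
    port.body.contents = new_ref      # the only place the text is modified


def contents(port):
    return port.body.contents


upd, cont = make_cell("num:0")
update(upd, "num:42")
```

Both ports are immutable and could be copied freely; only the single cell
body changes, which is why only one copy of its text may exist.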

The representation of semaphores could obviously be similar to the
representation of cells, with additional mechanism for handling waiting events. The details of
this do not shed much additional light on the fundamental concepts of this chapter
and will not be discussed further.


3.2.2: Dynamics of the System

The static and structural aspects of our implementation have now been
described in sufficient detail that we can proceed to consider its dynamics, that is,
the motivation and mechanism surrounding the scheduling of events, sending of
messages, and so forth. In the remainder of this chapter, we shall generally be
concerned with a "steady-state" situation in which, as a result of some unspecified
history, several processors in the system have been assigned things to do, and
need to communicate with other processors containing, for whatever reason, data
upon which they need to operate. In this section, therefore, we shall concentrate
on developing some intuition for such a "steady-state" situation, briefly exploring
mechanisms by which it might come to be and strategies by which it might be
managed.

Let us assume that initially, by some means such as an operator typing at a
keyboard, one processor somewhere in the network has been given an event to
process. As we have seen, each processor maintains an event list, so this
processor will now have one event on its event list. The goal of each processor is to
empty its event list, so our processor will take the new event off its list and see
what to do with it. Most likely, the event will cause some computation to occur
and then result in a new event’s being added to the event list, whereupon the
whole cycle will repeat. As long as this situation persists, and each event causes
exactly one new one to take its place, there is little opportunity for other
processors to get into the act.

Let us suppose therefore that at some point an event causes two or more
events to be added to the event list. Henceforth there will be several events
vying for our processor’s attention at the same time. The processor might continue
to operate on all the events itself by always taking the oldest event from the
event list, producing its consequent events, and placing these at the end of the
event list. To reduce queuing and dequeuing overhead, it might be smarter to go
through several generations of consequents from an event before putting the new
event(s) back on the event list. Another way to speed things up is to take
advantage of the other processors in the system. If there are enough events on the
event list, the overhead of sending some of them to a neighbor should be worth the
increased processing power thus brought to bear on the problem.
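
The basic cycle just described, together with one possible rule for handing
events to a neighbor, can be sketched as follows. The `offload_threshold`
parameter and the "least-loaded neighbor" rule are assumptions made for this
sketch; they are not the allocation strategy of the scheme itself. Events are
represented by small integers and the handler is a toy.

```python
from collections import deque


class Processor:
    def __init__(self, name, offload_threshold=2):
        self.name = name
        self.event_list = deque()      # this processor's local event list
        self.neighbors = []            # directly connected processors
        self.offload_threshold = offload_threshold  # hypothetical tuning knob

    def step(self, handler):
        """Process the oldest event and queue its consequent events."""
        if not self.event_list:
            return
        event = self.event_list.popleft()
        self.event_list.extend(handler(event))
        self._maybe_offload()

    def _maybe_offload(self):
        # Assumed policy: while the local list is too long, hand the newest
        # event to the least-loaded neighbor, provided that neighbor would
        # not end up with more events than we keep.
        while self.neighbors and len(self.event_list) > self.offload_threshold:
            target = min(self.neighbors, key=lambda p: len(p.event_list))
            if len(target.event_list) + 1 >= len(self.event_list):
                break
            target.event_list.append(self.event_list.pop())


def fork(n):
    """Toy handler: an event n > 0 produces two consequent events n - 1."""
    return [n - 1, n - 1] if n > 0 else []


a, b = Processor("A"), Processor("B")
a.neighbors.append(b)
b.neighbors.append(a)
a.event_list.append(3)
a.step(fork)   # event 3 is replaced by two events
a.step(fork)   # the list grows past the threshold; one event is handed to B
```

As long as each event produces exactly one successor, the threshold is never
exceeded and B stays idle; only growth in the event list brings it in.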

The process by which it is decided what events to execute where is the
subject of a subsequent section; before taking that up it is a good idea to have a
short look at the mechanics of moving events (and, as a consequence, objects)
around. Sending an event to another processor is simple enough: all that is
required is to send a list of references to the objects participating in the event.
This, however, represents only a small part of the overhead required to actually
bring that processor into the action. Before the new processor can make any
sense at all out of the event, it will need a copy of the text of the receiving
object (the first object) of the event. If it does not happen to already have this
text, it will have to send an inquiry for it to some processor which does. (This will
probably be the original sender of the event; however, it may be that no neighbor
of the inquirer has a copy of the text which enables it to reply immediately to the
inquiry. In this case, the inquiry must be forwarded until it reaches a processor
capable of replying, whereupon the reply must be forwarded back to the original
inquirer.) During the period that this inquiry is being sent and replied to, the event
cannot be processed. Depending on the nature of the receiving object, further
inquiry-response cycles may be required to gather enough information to process
the event. Finally, when the event is processed, a new event or events will most
likely be generated, containing references related to those present in the original.
Frequently, the texts of these objects will not be available locally either, so
additional inquiry-response delays may ensue as subsequent related events are
processed. Thus the true overhead associated with sending an event to another
processor is the overhead required to establish a "working set" for that event and its
consequents on that processor.
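
The inquiry-and-reply mechanism might be sketched as below. How a processor
decides where to send or forward an inquiry is not specified at this point, so
the `hint` table (recording, per object, one neighbor to ask) is purely an
assumption for illustration, as are the class and method names.

```python
class Processor:
    def __init__(self, name):
        self.name = name
        self.texts = {}    # object ID -> text held locally
        self.hint = {}     # object ID -> neighbor to ask (assumed bookkeeping)

    def inquire(self, obj_id):
        """Return (text, links_crossed) for obj_id, forwarding as needed."""
        if obj_id in self.texts:
            return self.texts[obj_id], 0
        # Forward the inquiry one link onward; the reply comes back over the
        # same link, so each forwarding step adds two link crossings.
        text, crossed = self.hint[obj_id].inquire(obj_id)
        return text, crossed + 2


a, b, c = Processor("A"), Processor("B"), Processor("C")
a.texts["X"] = "text of X"
b.hint["X"] = a            # B learned of X from A
c.hint["X"] = b            # C learned of X from B

text, crossed = c.inquire("X")
```

Here C's inquiry must pass through B, which has a reference to X but no text,
before A can reply; the event waiting on X at C is stalled for all four link
crossings.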

Having exposed the drawbacks of sending an event to another processor, it is
worth mentioning some mitigating factors. First, if the event really represents the
beginning of a computation that will proceed for a while without requiring too much
communication with other activities on the same processor, using another processor
can still come very close to doubling the effective computing speed of the system.
Second, the efficiency of the system can probably be improved greatly through
various forms of tuning. For example, the sender of an event might also send a
selection of object texts likely to be asked for by the receiver in the process of
setting up its "working set." Similarly, the reply to an inquiry might not be just one
object text, but a collection of related texts including the one asked for. These
strategies would not do much to reduce the number of bytes sent over
communication lines, but would reduce the number of wasteful delays between sending of
inquiries and receipt of responses.

Fortunately, the working set that must be accumulated by an event running on
a new processor can often be reasonably compact. Suppose, for instance, that the
event is a function receiving arguments and a continuation. A working set that will
allow this computation to proceed for a while would include the body of the
function (or at least those parts of the body which will be exercised by the arguments
given) and the texts of the arguments. If the arguments are numbers, no extra
information at all is required. If the arguments are highly structured, it is probable
(depending on the nature of the function) that only a portion of the texts
accessible from the argument references will ever be accessed. Note that the
text of the continuation is not needed at all, at least not until the function returns
a value to its continuation, which might be a natural point to transfer responsibility
back to the first processor, where the text of the continuation is presumably
located.

To summarize, then, we can imagine the system beginning to operate with one
event at some particular processor. Activity will spread to other processors as the
parallelism of the program increases and overloaded processors send events to
their less-loaded neighbors. Perhaps at some later point the various threads of the
computation will either die out or begin to come together again, perhaps using
tokens to synchronize. Thus the number of active processors might shrink until
there is again only one active processor with one event to process, perhaps "print
out the answer." Of course, another possibility is that the system is constantly
receiving new events to process as users come up with new tasks for the system
to perform. This will lead to more of a steady-state situation where, at any time,
some tasks are just beginning, some just ending, and others in various growing or
shrinking phases. In either case, the system will suffer from some amount of
overhead and inefficiency as processors build up working sets for newly acquired
events, but hopefully will gain even more from the application of extra processing
power to its business.


3.3: Object Management

As implied by the suggestions made in a previous section regarding the
representation of objects internal to processors, it is not intended that every
processor should "know about" (i.e., contain references to) every object in the
system. In fact, it is not intended that any processor should need to know about
every object in the system. These two constraints complicate the problem of
object management in our system. Fundamentally, we desire that local changes in
the status of an object should require only local processing. Thus if a processor
passes a reference to some object X to a neighbor which did not previously
contain any references to X, a local adjustment to the data base about X, preferably
involving only the two processors carrying out the transaction, should suffice. A
strategy requiring the notification of all processors with references to X that a new
member had joined their club, for example, is unacceptable.

This point seems fairly trivial, but a related point, also concerning object
management, is less so. So far, we have discussed the significance of the set of
processors which contain references to an object. Some subset of this set have a
special responsibility with regard to the object, however; they are the custodians
of copies of the text of the object. (Objects without texts, such as numbers, are
exempt from all the considerations discussed in this section; if the reference
contains all the information about an object, no fancy bookkeeping is required since
the reference will obviously be present wherever the object is referenced!)

The special responsibility imposed on a processor having a copy of a text is
twofold: first, the processor must cooperate with all other custodians of texts for
that object to ensure that at least one copy of the text exists at all times;
second, the processor may be asked to respond to inquiries from other processors
not having copies of that text. Viewed from the other side, a processor with a
reference to an object but no text must know where to send an inquiry if it
becomes necessary to obtain a copy of the text. Several solutions to this problem
can be imagined.

A solution that can be rejected almost immediately is to have one designated
processor be the custodian of all object texts in the entire system. Not only would
this be a major bottleneck and reliability problem, it is conceivable that no single
node in the network would have sufficient storage to hold all those texts.

The next solution, and a fairly plausible one, is to associate with each object a
designated processor which is to be the custodian of its text, perhaps the
processor where the object was created. This answers the objections raised to the
previous scheme, leaving as the only technical problem the matter of properly routing
the inquiries and replies. This problem, however, has been solved in many contexts
(e.g., the ARPANET[21]). This scheme is still open to objections, though. For one,
a processor may send to the designated custodian for a text which in fact is also
present at a much nearer processor. There is no mechanism for maintaining
information which might help determine the most appropriate target for an inquiry. This
might not be a serious problem on, say, a ring or Ethernet, where all processors are
approximately the same "distance" apart, but becomes a more serious
consideration if long-distance messages must be forwarded by many processors.

Another objection is that the custodian of an object cannot easily be changed.
In violation of our desiderata stated above, if primary custody of a text passes
even from a processor to its immediate neighbor, all possessors of references to
the corresponding object must be notified. Failing that, a "forwarding address"
must be left with the old custodian, leading to extra delay in responding to
inquiries, and imposing an extra burden on the old custodian that may nullify some
of the economies achievable by transferring custody in the first place. The scheme
to be presented below does incorporate the notion of custody, but many
processors can serve as custodians for the same object at the same time, and all
custodians are in principle equal. Furthermore, although forwarding information does need
to exist, it need only be kept by processors actively concerned with an object; no
permanent obligation rests with, for example, past custodians for the object.

To summarize, then, we envision a scheme in which at any time some subset of
the processors in the system may possess references to a particular object, and
some subset of those, in turn, will be custodians of copies of its text. Ideally, this
situation should be completely fluid; only those processors which need (for
purposes of their own internal operation) to have references to an object ought to be
forced to keep them, and processors with no further need to have the text of an
object ought not to be required to hang onto it. This should happen without regard
to the ancient history of the object (i.e., the creator of an object ought not to be
forced to accept any special responsibility for its continued existence). Finally, a
local change in the set of processors knowing about or having copies of the text
of an object should have only local ramifications. The scheme to be presented
below satisfies several of these criteria to a greater extent than the schemes
discussed so far. Nevertheless, it still does not have all of the desired
characteristics, and there is clearly room for further improvement.

3.3.1:  Reference  Trees 

The object management algorithm developed in this research involves
maintaining the processors which contain references to an object in a connected, acyclic
graph called the reference tree for that object. Each reference tree consists of
some subset of the nodes and arcs (processors and inter-neighbor links) of the
network. The nodes which belong to the reference tree are chosen to be those
processors in the network which contain references to the object, and the arcs
are chosen in such a way that (1) the reference tree is connected, i.e., it is
possible to reach any node in the tree from any other node, traveling only over arcs
that are in the tree, and (2) the tree is acyclic in that the arcs in the tree form no
closed loops. (Note that the arcs in the reference tree are undirected, hence
requirement (2) means that there should be no undirected cycles.) Put another
way, there is a unique path (using only arcs in the tree) from any node in a
reference tree to any other. One additional consideration which is not obvious from the
above description is that every arc in a reference tree goes between two nodes
that are in the tree; no reference tree can include arcs between two nodes not in
the tree, or even between one node in the tree and one node not in it. Reference
trees are so named because they form unrooted trees embedded in the network
(see Figures 3.2 and 3.3).
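
The conditions just stated can be checked mechanically. The following Python
sketch is an illustration only: the function name and the representation of
processors and links as plain values and pairs are assumptions made for the
example, and the check is a centralized one, whereas the scheme itself never
needs a global view of the tree.

```python
from collections import deque


def is_reference_tree(members, tree_arcs, network_arcs):
    """Check the reference-tree conditions for one object.

    members      -- processors holding a reference to the object
    tree_arcs    -- pairs of processors claimed to be arcs of the tree
    network_arcs -- pairs of processors joined by a physical link
    (All three representations are assumptions of this sketch.)
    """
    members = set(members)
    arcs = {frozenset(arc) for arc in tree_arcs}
    network = {frozenset(arc) for arc in network_arcs}
    if not members:
        return False  # choice made here: no references, no tree
    # Every arc must be a network link joining two distinct member nodes.
    for arc in arcs:
        if len(arc) != 2 or arc not in network or not arc.issubset(members):
            return False
    # A connected undirected graph on n nodes is acyclic exactly when it
    # has n - 1 arcs, so count first and test connectivity afterwards.
    if len(arcs) != len(members) - 1:
        return False
    adjacency = {node: set() for node in members}
    for arc in arcs:
        a, b = tuple(arc)
        adjacency[a].add(b)
        adjacency[b].add(a)
    start = next(iter(members))
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node] - seen:
            seen.add(neighbor)
            queue.append(neighbor)
    return seen == members
```

A closed loop of four processors fails the arc count, two members with no arc
between them fail it as well, and an arc leading to a processor outside the
member set is rejected by the first test.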

It is important to note that, in general, the reference trees for different
objects need bear no relation to each other. In particular, it is not the case that
there is one "reference tree" in the network, used for all objects. Central to the
concept of reference trees is that they are free to grow and shrink dynamically,
following changes in the roles of the corresponding objects in the operation of the
system.

Also significant is the fact that reference trees can be maintained by a
completely distributed mechanism in which each processor in a tree remembers only the
state of its immediately adjacent links (i.e., whether each link is in the tree or not).
Processors not in the tree for an object, of course, have no references to the
object, and need remember no information about it. Even the cycle-free nature of
the tree can be preserved on the same strictly distributed basis: no central
clearinghouse is needed to determine whether a cycle is being formed.

Figure 3.2: Examples of reference trees (in heavy lines)

As mentioned previously, some subset of the processors having references to
an object will also be custodians of a copy of the text of the object. In our
current scheme, this means that some subset of the nodes in any reference tree
will in addition fill the special role of custodians. In fact, the primary reason for
requiring reference trees to be connected is so that any processor with a
reference to an object can always communicate with a custodian for that object simply
by following links that are part of its reference tree. Indeed, we will require that
all communication concerning an object travel strictly over links that are in the
reference tree for that object. The only exception to this rule occurs when a new
node is in the process of being added to the tree; this case will be discussed in
more detail in the next section.

Figure 3.3: Examples that are not reference trees

Given that all requests for the text of an object will travel through the
reference tree for that object, and given that, if routed properly through the tree, such
requests are guaranteed to eventually reach a custodian of the text, the question
remains of how to make sure such a request is indeed routed properly. It is here
that the acyclic nature of the tree comes into play. For every adjacent link that a
processor believes to be part of the reference tree for some object, that
processor will maintain an additional piece of information indicating whether any processor
reachable through that tree starting with that link is a custodian for the text of
that object. Since the tree is connected, it is guaranteed that if a custodian
exists, it is reachable from any node in the reference tree starting from some link.
Since the tree has no undirected cycles, it is certain that the sets of processors
reachable starting with different links are disjoint. The strategy for initiating or
forwarding an inquiry is thus simply to pick some link from which a custodian can be
reached, and send the message in that direction. In addition, any processor
forwarding an inquiry will need to remember the direction from which the inquiry came,
so that the reply can be routed back along the same route. Since the tree has no
cycles, neither the inquiry nor the reply message can find itself traveling around a
loop, and consequently each will eventually reach its destination (assuming
message forwarders are clever enough never to route an inquiry back to its sender,
even though there might be a custodian in that direction too!).
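
The routing strategy above can be sketched in a few lines of Python. This is a
minimal illustration under stated assumptions: the names route_inquiry and
custodian_flags, and the dictionary representation of the tree, are invented for
the example. In particular, custodian_flags computes the per-link flags from a
global view purely to set up the example; in the scheme described here each
processor maintains its own flags locally.

```python
def custodian_flags(tree, custodians):
    """For each (node, neighbor) pair, record whether a custodian is
    reachable through the tree starting with that link.
    Computed globally here only to set up the example (assumption)."""
    flags = {}
    for node in tree:
        for neighbor in tree[node]:
            seen = {node, neighbor}
            stack = [neighbor]
            while stack:
                current = stack.pop()
                for nxt in tree[current]:
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
            seen.discard(node)
            flags[(node, neighbor)] = bool(seen & custodians)
    return flags


def route_inquiry(tree, flags, custodians, origin):
    """Forward an inquiry from origin until it reaches a custodian.

    Returns the inquiry path and the reply path (the same route reversed).
    An inquiry is never sent back over the link it arrived on.
    """
    path = [origin]
    came_from = None
    node = origin
    while node not in custodians:
        candidates = [n for n in sorted(tree[node])
                      if n != came_from and flags[(node, n)]]
        if not candidates:
            raise LookupError("no custodian reachable from " + str(node))
        came_from, node = node, candidates[0]  # "pick some link"
        path.append(node)
    return path, path[::-1]


# A small reference tree: A - B - C - D, with E hanging off B.
tree = {"A": {"B"}, "B": {"A", "C", "E"}, "C": {"B", "D"},
        "D": {"C"}, "E": {"B"}}
custodians = {"D"}
flags = custodian_flags(tree, custodians)
inquiry_path, reply_path = route_inquiry(tree, flags, custodians, "E")
```

Because each forwarder excludes the link the inquiry arrived on and follows only
flagged links, every step moves into a subtree that contains a custodian, so the
loop terminates.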

So far, we have discussed the reference tree simply as a static structure,
with no indication of how it can grow or shrink, or of how custody of a text may be
changed. These topics are explored in the next section; for now it suffices to
note that, in accordance with our desiderata, local changes require only local
modifications to the distributed data base which serves to represent the tree.
Even this scheme, however, still falls short of our goals in several ways.

One immediately obvious liability of the reference tree approach arises from
the requirement that the tree remain connected. This requirement means that if
two far-apart processors both need to maintain a reference to an object, several
processors between them (a sufficient number to keep the two ends connected)
must also be part of the tree, even though they have no other involvement with
the object. The impact of this is not tremendous, for the amount of overhead
involved in membership in a reference tree is not large. In any scheme,
presumably, the several processors between would be involved in forwarding messages
from one end to the other, so the only extra overhead imposed by the reference
tree mechanism (assuming our cellular network as the underlying hardware) is the
static storage overhead of remaining part of the reference tree.


The impact of this overhead can be further reduced by several mechanisms.
First, if the network is fairly "bushy" in structure, as opposed to, for example, a
long chain of processors, then the distance (in terms of the number of links
traversed) between any two processors will be reduced, cutting the length that
such long reference chains might grow to. Second, if some processors tend to run
along "trunk lines" while others tend to occupy "terminal" positions connected to
only one or two other processors, there will be a natural division of labor with the
"trunk" processors bearing most of the load of maintaining long reference trees.
Such processors might be specifically designed with this service role in mind.
Third, some event distribution strategies will tend to prevent such long reference
chains from developing. Finally, it is possible under some circumstances to remove
the requirement that a reference tree remain connected; this will be discussed in
Section 3.3.6.

The other restriction that applies to reference trees is that they must be kept
free of cycles. This is not too difficult; the method will be discussed below. The
prohibition of cycles causes some other problems, though, related to the fact that
even though two neighbors are part of a reference tree, the link between them
may not be (if its addition to the tree would close a cycle). We have required that
all communications concerning an object travel strictly along the links in its
reference tree. Thus a processor might, for example, issue an inquiry along some link in
the tree; that inquiry might travel a long distance before it could be satisfied. It
is possible, however, that sending the inquiry over some link not in the tree would
result in its reaching a neighbor in the tree, and that the request could be satisfied
immediately by that neighbor (see Figure 3.4). This is just one illustration of the
various ways in which a reference tree can fail to be optimal; the question of how
to reorganize the tree to a better shape will be discussed later.

Figure 3.4: A non-optimal reference tree

The solid box represents a processor with a copy of the object's text; the
shaded box represents a processor inquiring for the text.

3.3.2: Reference Tree Maintenance

This section describes an interprocessor communication protocol which may be
used to grow and shrink reference trees while preserving the required
connectedness and freedom from cycles. Also described is another protocol, fairly
independent of the first, which may be used for communicating and managing custody of
object texts. These protocols turn out to be fairly intricate and inelegant.
However, they do work (further justification of this claim and the story of how these
protocols were developed are presented in Appendix A). The protocols grew to
their present level of complexity due to the propensity of multiprocessor systems
to deadlock and/or reach inconsistent states through unfortunate timing of
independently initiated requests. While it is hard to believe that there do not exist simpler
protocols which will do the job, several simpler protocols were considered and
rejected in the process of developing these. In the discussion below, we explore
the reasons for the failure of these simpler approaches and the motivation for our
current scheme. One final note: the protocols discussed here make no attempt to
recover from damaged or lost messages, or messages arriving out of order. These
problems can be solved by various well-known means [21] which may be assumed to
provide the underlying protocol on each link. It may be that minor modifications of
the concepts to be presented would eliminate the necessity for any such
underlying protocol, but such work would further complicate the algorithms presented
below and was judged to be beyond the scope of this research.

3.3.2.1:  Changes  in  Reference  Tree  Membership 

This  protocol  is  concerned  with  maintaining  that  part  of  the  reference  tree 
data  base  devoted  to  recording  whether  or  not  particular  processors  or  links  are 
members.  Custody  issues  play  no  direct  role  here,  although  changes  in  custody  will 
often  be  the  cause  of  various  transactions  of  the  kinds  described  below. 

The membership protocol involves six basic kinds of messages, whose meaning
is roughly as given in Table 3.5. Each message is, of course, specialized by
identification of the object (and hence reference tree) that it pertains to.

Message   Meaning

R+        object reference plus request to add link
R-        object reference without request to add link
+         add link to tree
-         request to drop link from tree
A+        positive acknowledgment
A-        negative acknowledgment

Table 3.5: Membership protocol message types

These message types, of course, derive their exact meaning from the way in which
they are used by the protocol, but a few general comments are in order regarding
the circumstances under which the messages may be sent. A+ and A- messages
are sent only in response to other messages. + and - messages are sometimes
sent spontaneously, and are sometimes sent in response to R+ messages. R+ and
R- messages are always sent spontaneously; that is, in response to some external
stimulus, such as the need to send a text containing a reference to the object,
rather than in response to any of the messages listed in Table 3.5. R- is the only
kind of message that never provokes a response, and corresponds to
communicating a reference to the object over a link which has at least provisionally been
included in the reference tree for the object. R- messages cannot be sent under
all circumstances, however; sometimes an R+ message is required to establish the
intent to add the link in question to the reference tree.
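
For readers who prefer to see the six message kinds and the sending rules just
given in a compact form, the following Python sketch encodes them as an
enumeration and a few sets. It is illustrative only; the identifier names are
assumptions of the sketch and do not come from the protocol itself.

```python
from enum import Enum


class Msg(Enum):
    """The six membership-protocol message kinds of Table 3.5."""
    R_PLUS = "R+"    # object reference plus request to add link
    R_MINUS = "R-"   # object reference without request to add link
    PLUS = "+"       # add link to tree
    MINUS = "-"      # request to drop link from tree
    ACK_POS = "A+"   # positive acknowledgment
    ACK_NEG = "A-"   # negative acknowledgment


# Kinds that may be sent spontaneously (on some external stimulus).
SPONTANEOUS = {Msg.R_PLUS, Msg.R_MINUS, Msg.PLUS, Msg.MINUS}
# Kinds that may be sent in response to another protocol message.
RESPONSE = {Msg.PLUS, Msg.MINUS, Msg.ACK_POS, Msg.ACK_NEG}
# Kinds that carry an object reference.
CARRIES_REFERENCE = {Msg.R_PLUS, Msg.R_MINUS}
# The only kind that never provokes a response.
NEVER_ANSWERED = {Msg.R_MINUS}


def only_sent_as_reply(msg):
    """True for kinds that are never sent spontaneously (A+ and A-)."""
    return msg in RESPONSE and msg not in SPONTANEOUS
```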

The membership protocol operates by associating with each end of each link a
state for each possible object. In terms of implementation, each processor must
maintain a data base for each object it has a reference to, indicating the state
(with respect to that object) of each link adjacent to the processor. It is
important to realize that the two processors at the ends of a link may have different
ideas of the state of the link; this may be as the result of some intentionally
introduced asymmetries discussed below, or it may occur if messages regarding the
object have been sent at one end of the link but not yet received at the other.

The possible states may be broadly characterized as being either stable or
transient. Stable states are states which might be expected to persist over a
relatively long period of time. Transient states are those in which a message has
been sent across the link and a reply is expected; the reply will cause a
transition to some other state, either stable or transient. Of the two kinds of states, the
stable states are the more important; the transient states exist to provide the
proper sequence of events so that the next pair of stable states to be
established is consistent and does not result in partitioning the tree or closing a cycle.
For the purposes of the discussion below, the states have been given one- to
three-character mnemonic names.

Perhaps the state most likely to occur is X, which indicates that not only is the
link not considered part of the reference tree, the processor does not even
contain any references to the object. Clearly if a processor is in state X with respect
to a given object for one link, the processor should be in state X (or the transient
state X?) with respect to that object over every link.

A state closely related to X is N. In state N, the processor does contain
references to the object, but does not believe the link in question to be part of the
object's reference tree. This state may come about either because the processor
is the only processor to contain any references to the object, or because the
processor is connected to the reference tree by some other link or links.

Another stable state is M, which indicates that the link in question is believed
to be part of the reference tree, and furthermore that this processor is currently
the master of that link (for transactions involving that object). The master of a link
is the only one that can effect changes in the status of the link; this asymmetry
seems to be necessary to prevent confusion resulting from, for instance, both ends
of a link simultaneously attempting to terminate their connection with the reference
tree.

In a stable condition, the state at the other end of a link from M may be S, for
"slave." A processor in state S cannot directly cause a change in the status of
the link; it may however request the master to commence a change, and it may
respond to changes ordered by the master.

The final stable state, closely related to S, is SR. A processor will go into
state SR from state S upon sending a reference to the object (R- message) over
the link. It is necessary to remember whether such a reference has been sent by
the slave because it modifies the proper response to an attempt by the master to
terminate the connection. This is because the master might attempt to terminate
the connection before receiving the R- message (which caused the transition to
state SR), and the response by the slave to this attempt should take into account
the possibility that that message might be received at the other end after the
attempt. Under such circumstances, allowing the link to be broken could cause a
partitioning of the reference tree.

This completes our discussion of the five stable states. The transient states
will not be described to this level of detail. For the most part, they acquire their
meaning from the stable states that precede them or to which transitions can be
made. Instead of attempting to describe the meaning of these states, we present
a complete state-transition table (Table 3.6) and diagram (Figure 3.7), and
summarize below the normal sequences for effecting various kinds of state changes.

The fundamental principle that motivates this protocol design (other than the
need to maintain the link data base in a consistent state) is that a processor must
always be able to send a reference (of either the R+ or R- variety, whichever is
appropriate) over any link without prearrangement. In other words, it is not
acceptable that the sending of a reference should be part of some transaction beginning,
say, with the sending of some other message, and allowing the reference to be
sent only upon receipt of a suitable reply. In order to understand this requirement,
we must examine the circumstances under which references may be sent.

In general, a reference will be sent as part of some event or text which is
being communicated between processors. Sending a text involves communicating
two kinds of references: the reference to the object whose text is being sent,
State 

R-f 

Transition  Upon  Receiving 

R-  ♦ - A+  A- 

Spontaneous 

Transitions 

X 

N 

M 

8 

SR 

*:S 

-:IIT1 

M 

8 tM  A>:NT1 

SR  :M  A+:ST 

:N 

R-f:IIIIT  :X 

R-;lll  *:S  -;XT 

R-:8R 

R-:8R 

XT 

XT  A-^:M  A-:X 

R-:XT  :IIT 

M? 

NT  A-:NT1  A-:N 

R-:NT  :XT 

M71 

:»IT1  :N 

R4>:MT2 

M? 

-tlllTI 

MT  :M  A-:N 

R-:MT 

M?1 

MT1  A-:NT1 

R-:MT1 

M?2 

MT2  :I|IT 

R-:MT2 

ST 

ST  :S  A-:N 

R-.SRT 

8RT 

SRT  :SR  A-:N 

R-:SRT 

The notation a:b means that under the specified circumstances, a transition to state
b can occur with the emission of message a.

Table  3.6:  Membership  protocol  state  transition  table 


and references to other objects referred to in the text. (Sending an event only
requires communication of references to the objects composing the event.) Thus
obtaining clearance to send a text or event may involve simultaneously obtaining
clearance to send several object references. Unless every processor always has
clearance to send any object reference, it is easy to see how the piecemeal
aggregation of such clearance could lead to a deadlock on the link. This is
especially true when, as is the case with this protocol, transactions involving different
objects are completely independent; no overall master-slave relationship applies to
all communication over a particular link, for example.

This need to avoid deadlock is one of the primary factors acting to complicate
the protocol design, and requires that any processor always be able to send any
object reference without the possibility of confusing the processor at the other
side. The only exception to this requirement is state X: if a processor has no
references to an object, it has none to send!

Figure 3.7: Membership protocol state transition diagram

Transitions caused by receiving R- messages have been omitted. Transitions
accompanying the sending of an R- message have been omitted where the state after
sending the message is the same as the state before.

We now turn to the various state changes that may occur, and why those
changes may be appropriate. We start with a processor in state X, having no
references to the object. The only kind of message that can be received in state
X is an R+ message from some processor attempting to extend the reference tree
for the object as a result, perhaps, of sending an event mentioning the object.
Upon receipt of the R+ message, a + message is returned as an indication that the
link should indeed be added to the tree, and a transition is made to state S in
anticipation of the sender of the R+ message entering the M (master) state when it
receives the + message. Simultaneously, the states of all other links to the
processor change from X to N, indicating that the processor is now part of the
reference tree, but that it does not believe any of its other links to be part of this tree.
(Also, any links in state X? change to N?.) As may be noticed from Table 3.6 or
Figure 3.7, the state change between X and N does not require any notification of
the party at the other end of the link.
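
The step just described touches every link of the receiving processor, not only
the link on which the R+ arrived. The Python sketch below illustrates this with
a hypothetical per-object record of link states; the class name ObjectLinks, its
method name, and the use of strings for states and link names are assumptions of
the example rather than part of the protocol.

```python
class ObjectLinks:
    """State of every adjacent link with respect to one object, as kept
    by one processor (hypothetical layout, for illustration only)."""

    def __init__(self, links):
        # A processor with no references to the object is in state X
        # on every link.
        self.state = {link: "X" for link in links}

    def receive_r_plus_in_x(self, link):
        """Handle an R+ arriving over `link` while that link is in state X.

        Returns the message to send back over the same link.
        """
        if self.state[link] != "X":
            raise ValueError("this sketch only covers receipt in state X")
        for other, current in list(self.state.items()):
            if other == link:
                continue
            # The processor now holds a reference, so its other links move
            # from X to N (and from X? to N?), with no message sent.
            if current == "X":
                self.state[other] = "N"
            elif current == "X?":
                self.state[other] = "N?"
        # The arrival link joins the tree with this processor as slave.
        self.state[link] = "S"
        return "+"
```

For example, a processor with links north, east, and south, all in state X,
ends up with the arrival link in state S and the other two in state N.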

Now that our processor is part of the reference tree, it may need to attempt
to further extend the tree by sending a reference along one of the links just
converted to state N. From state N an object reference must be sent as an R+
message. Upon sending the message, the sender's state for that link changes from N
to the transient state M?, awaiting a reply. While in state M?, additional
references may be sent as R- messages if necessary. The reply to R+ depends on the
condition of the processor at the other end of the link. If it was in state X, it
changes to S and replies with +, as described above. Upon receiving the +
message, the sender of the R+ message changes from M? to M, and the link has been
established. If the other processor is in state N (also possibly M?) then the link
cannot be added to the reference tree because it would close an undirected cycle
(since both processors are connected by some other route already in the
reference tree). Consequently, the other processor responds negatively, with a -
message. When the sender of the R+ message receives the - message, it moves back
to state N while emitting the negative acknowledgment A-. When it receives the
A-, the other processor then returns to state N from N?1, a transient state it had
entered after sending the - message. The extra level of acknowledgment here is
needed because a processor in state M? may send object references as R-
messages, a capability it needs to have. The other processor must be prevented from
returning to state N, or worse, to state X, before its reply is known to the
originating processor; otherwise, it would be confused by receiving these R- messages.
Consequently, the responding processor is held in state N?1, in which it is able to
accept R- messages, until the acknowledgment A- arrives, certain to have followed
any R- messages that might have been sent. Processors in states N and X cannot
accept R- messages because they are not requests to extend the reference tree,
and if they were interpreted as such, could cause confusion resulting in the
partitioning of the reference tree.

Another possible scenario is that two processors, both in state N (for the same
link) might simultaneously attempt to add that link to the tree by sending R+
messages to each other and entering state M?. Under these circumstances, it is clear
that the link should not be added, to prevent forming a cycle. Thus each M? will
reply to the R+ with a - message and a transition to M?1. Receipt of the -
messages will prompt the sending of A- messages and transitions to N?1. Finally, when
the A- messages are received, both processors will revert to state N.

M?2, the only other state in the M? complex, is required because a processor
in state N?1 (waiting for an A- acknowledgment before returning to state N) may
find it necessary to send out an object reference. Since the link is not considered
to be part of the reference tree at this moment, an R+ message must be sent.
Consequently, state M? (awaiting a response to the request to add the link to that
object's reference tree) should be entered after receiving the expected A-
message. The function of state M?2 is to wait until the A- is received, since an A-
message cannot be handled in state M?.
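
The link-addition sequences described above can be traced with a small
simulation. The Python sketch below is a minimal illustration, not a complete
implementation: it encodes only the transitions stated in the preceding
paragraphs (not the full contents of Table 3.6), models one link for one object
with a FIFO queue in each direction, and uses invented names (Link, ON_RECEIVE,
ON_SEND) that are assumptions of the example.

```python
from collections import deque

# (state, message received) -> (message emitted in reply or None, next state)
ON_RECEIVE = {
    ("X", "R+"): ("+", "S"),        # accept: link joins the tree
    ("N", "R+"): ("-", "N?1"),      # refuse: would close a cycle
    ("M?", "R+"): ("-", "M?1"),     # both ends asked at once
    ("M?", "+"): (None, "M"),       # request granted, become master
    ("M?", "-"): ("A-", "N"),       # request refused, acknowledge
    ("M?1", "-"): ("A-", "N?1"),
    ("N?1", "A-"): (None, "N"),
    ("N?1", "R-"): (None, "N?1"),   # N?1 is able to accept R- messages
    ("M?2", "A-"): (None, "M?"),
}

# (state, message sent spontaneously) -> next state
ON_SEND = {
    ("N", "R+"): "M?",
    ("M?", "R-"): "M?",
    ("N?1", "R+"): "M?2",
}


def other(end):
    return "b" if end == "a" else "a"


class Link:
    """One link, for one object, with ends 'a' and 'b'."""

    def __init__(self, state_a, state_b):
        self.state = {"a": state_a, "b": state_b}
        self.inbox = {"a": deque(), "b": deque()}

    def send(self, end, msg):
        """Spontaneously send msg from the given end."""
        self.state[end] = ON_SEND[(self.state[end], msg)]
        self.inbox[other(end)].append(msg)

    def deliver(self, end):
        """Deliver the oldest message waiting at the given end."""
        msg = self.inbox[end].popleft()
        reply, self.state[end] = ON_RECEIVE[(self.state[end], msg)]
        if reply is not None:
            self.inbox[other(end)].append(reply)

    def run(self):
        """Deliver messages until none are in transit."""
        while self.inbox["a"] or self.inbox["b"]:
            for end in ("a", "b"):
                if self.inbox[end]:
                    self.deliver(end)
```

Starting from Link("N", "X"), a single R+ from end a leaves a as master (M) and
b as slave (S). Starting from Link("N", "N"), simultaneous R+ messages from
both ends pass through M?1 and N?1 and leave both ends back in state N.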

Once it has been agreed that a link is part of the reference tree for an object
and things have settled to a quiescent state (i.e., no messages are in transit), one
processor (the master) will be in state M and the other (the slave) in state S (or
possibly SR). It is a simple matter to reverse the roles of master and slave, but
the transaction must be initiated by the master. The master sends a + message
and enters state S. When the slave receives the message, it enters state M.
Since only the master can initiate this transaction, if a slave wishes to become
master (which it may for reasons discussed below) it must first use some
mechanism outside this protocol to induce the master to begin the role-reversal (such as
sending some "request-for-mastery" message).

Of course, both master and slave can freely send object references (in the
form of R- messages) to each other. In the case of a slave, the sending of a
reference is accompanied by a transition to state SR, for reasons explained
below.
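
The rules above for an established link can be sketched as a small state
machine. This is only an illustration under stated assumptions: the class name
`LinkEnd` and its methods are hypothetical, message transport is collapsed into
direct calls, and it is assumed that a slave in state SR reacts to a * message
the same way a slave in state S does.

```python
class LinkEnd:
    """One processor's view of a link that is already in the reference tree."""

    def __init__(self, state):
        if state not in ("M", "S", "SR"):
            raise ValueError("unknown link state: " + state)
        self.state = state

    def send_reference(self):
        # Either end may send an R- message; a slave records it by entering SR.
        if self.state == "S":
            self.state = "SR"
        return "R-"

    def start_role_reversal(self):
        # Only the master may initiate: it sends "*" and becomes the slave.
        if self.state != "M":
            raise RuntimeError("only the master can initiate role reversal")
        self.state = "S"
        return "*"

    def receive(self, message):
        # Assumption: both S and SR become M on receipt of "*".
        if message == "*" and self.state in ("S", "SR"):
            self.state = "M"
```

A slave that wants mastery cannot call `start_role_reversal` itself; as the
text notes, it has to ask the master by some means outside this protocol.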

Having seen how a link may be established to be in the reference tree, we
now come to the question of how a link may be deleted from the tree. Due to the
connected, acyclic nature of the tree, it is true that, as we have seen above,
every time a link is added to the tree a new node (processor) is added also.
Similarly, every time a link is deleted, a node is also being removed from the
tree. Thus the only reason for deleting a link is that a node wants to remove
itself from the reference tree. This in turn will be caused by that processor's
discovery that none of the references it has to the object are reachable from
any active data on that processor. In other words, the reference is
garbage-collectable on that processor. A later discussion of garbage collection
will make more precise the conditions under which a node is allowed to remove
itself from the tree; for now, we observe simply that it must be a leaf node of
the tree. Equivalently, it must have only one neighbor also in the tree.
Otherwise, removal of the node would partition the tree, since the node forms
the only connection (within the tree) between its various neighbors. Thus a
processor may attempt to remove itself from the tree only if all its links but
one are in state N (or N?). Additionally, that one link must be in state M; if
the processor is currently a slave on that link (for that object), it must
first induce the master of the link to relinquish its mastery.
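
The precondition just stated can be written as a short predicate. This is a
minimal sketch; the function name and the dictionary representation of a
processor's per-object link states are assumptions made for illustration.

```python
def may_attempt_removal(link_states):
    """Check whether a processor may try to leave an object's reference tree.

    link_states maps each neighbor to this processor's state on that link
    (for the object in question), e.g. {"p2": "M", "p5": "N"}.
    """
    # Links in state N or N? are not (or not yet) part of the tree.
    in_tree = [s for s in link_states.values() if s not in ("N", "N?")]
    # Leaf condition: exactly one tree link, and we must be its master.
    return len(in_tree) == 1 and in_tree[0] == "M"
```

A processor that is a slave on its single tree link fails the check until it
has obtained mastery of that link.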

A master requests to remove itself from the tree by sending a - message to
its slave and changing to state X? (simultaneously all N links from that
processor should change to X and all N? links to X?). If the slave is in state
S, it accepts the deletion by responding with A- and changing to state N?1; the
standard link-refusal sequence takes over from there. If the slave is in state
SR, however, it refuses the deletion by replying with A+ and changing to state
S?. The reason for this is that state SR indicates the slave has sent an object
reference to the master, and there is no way (in this simple finite-state
model) of telling whether the reference was received before the master sent the
- message. If it was received afterward, the master, though still in state X?,
is once again in possession of a reference to the object, and allowing the link
to be deleted might result in partitioning the reference tree. To be safe,
then, the deletion is refused and a short hand-shaking phase is entered.

When the A+ reply is received by the old master still in state X?, the old
master will return to state M and acknowledge with another A+. This will cause
the old slave, now in state S?, to return to state S. By this mechanism, its
record is cleansed of having been in state SR, and unless further activity
occurs the next attempt to delete the link will succeed. If the old slave sent
another object reference while in state S?, it will have changed to state SR?
to record that fact. Receipt of the A+ will then cause a return to state SR.
State SR? is needed for the same reasons that SR is.
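
The deletion request and its refusal path can be summarized as a transition
table. The sketch below covers only the transitions described in the two
preceding paragraphs; the table and function names are hypothetical, and the
continuation of the accepted case (the link-refusal sequence after N?1) is
deliberately left out.

```python
# (state, incoming message) -> (new state, reply to send, or None)
DELETION_TRANSITIONS = {
    ("S", "-"): ("N?1", "A-"),    # slave accepts the deletion
    ("SR", "-"): ("S?", "A+"),    # slave refuses: it has sent a reference
    ("X?", "A+"): ("M", "A+"),    # old master stays in the tree, acknowledges
    ("S?", "A+"): ("S", None),    # record of having been in SR is cleared
    ("SR?", "A+"): ("SR", None),  # a reference was sent during the handshake
}


def step(state, message):
    """Return (new_state, reply) for one end of the link."""
    return DELETION_TRANSITIONS[(state, message)]


def send_reference(state):
    """State change at a slave when it sends an R- reference."""
    return {"S": "SR", "S?": "SR?"}.get(state, state)
```

Walking a refused deletion through the table shows the hand-shake: the slave
in SR answers A+, the master in X? returns to M and answers A+, and the slave
ends in S, so the next deletion attempt can succeed.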

It is possible that the old master will have been added again to the reference
tree (over some other link) before either the A+ or A- reply is received. In
this case, its state for this link will have changed from X? to N?, indicating
that although the status on this link is still being negotiated, the final
outcome must be negative (to prevent cycles). There is no danger of
partitioning the tree since this node is now connected by another path.

If the old slave responded with A-, signifying its agreement that the link
should be deleted, there is no problem. However, if the response was A+, the
old master now in state N? must quash the attempt to keep the link in the tree.
This is accomplished by answering with A- instead of with A+, as X? would. The
old slave then goes to state N and responds with another A-. This extra level
of acknowledgment is necessary, as in the earlier discussion of state N?1, to
insure that a reference in the R- form can be sent at any time from the old
slave, up until it receives the A- message and changes to state N.

We have seen how the membership protocol operates to maintain the connected,
acyclic nature of reference trees. We now turn to a mechanism for managing
object texts.

3.3.2.2:  Changes  in  Object  Text  Custody

The management of object text custody has two basic goals: to insure that
no object text is "lost" (i.e., to insure that at least one processor has
custody of an object's text at all times), and to keep each processor in the
reference tree for an object apprised of the directions in which it may send
inquiries requesting a copy of the text. There is an additional issue regarding
mutable objects such as cells and tokens: making sure that changes in the text
are visible to the appropriate agents at the appropriate times.

We discuss first the means by which processors can obtain copies of the text
of an object. Each processor in the reference tree for the object must keep
three bits of information specific to that object for each link to that
processor. (This is in addition to the record-keeping required for the
membership protocol described in the preceding section.) The first of these
bits (the "text-this-way" bit) indicates whether a copy of the text may be
reached by following some path in the reference tree starting with that link.
If this bit is not set, it is clearly fruitless to send an inquiry for a text
in this direction. The second of these bits indicates whether any inquiry has
been received over that link (and not yet satisfied). The inquiry-received bit
is used for routing replies to inquiries that had to be forwarded to be
satisfied. It is also used to prevent forwarding an inquiry back to its sender
even though there might be a copy of the text in that direction. Figure 3.8
shows an example of how this might happen.


[Figure with two panels: "Proper Forwarding" and "Improper Forwarding"]

Figure 3.8: Inquiry received from direction of text


The third bit indicates whether any inquiry regarding that object has been sent
over that link and not yet satisfied. This might be used to aid in
retransmitting apparently lost inquiry messages, but its real purpose is to
streamline the inquiry process while preventing deadlock. If an inquiry is
received over some link and an inquiry for the text is already outstanding over
some other link, then it is not necessary to send another inquiry; when the
reply to the first inquiry arrives, it may be used to satisfy the latest
inquirer as well as any previous ones. If an inquiry is received over the same
link to which an outstanding inquiry has been sent, however, the new inquiry
must be forwarded, and over some link other than the one on which it was
received. Otherwise a deadlock situation could result as in Figure 3.9, where
each processor could decide to ignore the other's inquiry since it had already
sent one.


Figure 3.9: Possible deadlock in inquiries

There are two reasons for a processor that does not have a copy of the text
of an object to send an inquiry: either the text is needed by some computation
occurring on that processor, or an inquiry for the text has been received from
some other processor. In either case, the strategy is as follows. If an inquiry
for that text has already been sent from this processor, simply await the reply
to that inquiry (except as discussed above). If no inquiry has been sent, send
one to any neighbor which has a copy of the text in its direction. Of course,
if the inquiry must be sent because of an inquiry received, the inquiry sent
should not be sent back in the same direction, even if there is a copy of the
text in that direction (if an inquiry was sent in our direction, there will
always be another link which is part of the path from the source of the inquiry
to the copy of the text that the sender had in mind).
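
The inquiry strategy, together with the three per-link bits, can be sketched as
follows. The names `LinkBits` and `route_inquiry` are hypothetical, and the
choice of the first eligible neighbor is an arbitrary assumption; the text only
requires that some neighbor with a text in its direction be chosen.

```python
from dataclasses import dataclass


@dataclass
class LinkBits:
    text_this_way: bool = False        # a text is reachable via this link
    inquiry_received: bool = False     # an unsatisfied inquiry came from here
    inquiry_outstanding: bool = False  # we sent an inquiry here, no reply yet


def route_inquiry(links, received_from=None):
    """Return the neighbor to send an inquiry to, or None to await a reply.

    links maps neighbor name -> LinkBits for one object. received_from is the
    link an inquiry arrived on, or None if the text is needed locally.
    """
    if received_from is not None:
        links[received_from].inquiry_received = True
    # An inquiry outstanding on some *other* link will also satisfy this one.
    if any(b.inquiry_outstanding for n, b in links.items() if n != received_from):
        return None
    # Otherwise forward, but never back toward the sender of the inquiry.
    for name, bits in links.items():
        if name != received_from and bits.text_this_way:
            bits.inquiry_outstanding = True
            return name
    raise LookupError("no link leads toward a copy of the text")
```

The case that avoids the deadlock of Figure 3.9 is the one where the only
outstanding inquiry was sent over the very link on which a new inquiry arrives:
the function then forwards over a different link instead of waiting.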

If custody of a text changes (for example, in response to an inquiry), a copy
of the text will in general be sent along some link. As we have already
discussed, this may result in changes to the reference trees not only of the
object whose text is being moved, but of any objects directly referenced from
the text. Additionally the bits pertaining to the location of texts in the
reference tree must be updated properly. The inquiry-received and
inquiry-outstanding bits are set and cleared at the obvious times, but the
treatment of the text-this-way bit deserves closer attention.

When a copy of the text of an object is sent from a processor over a link,
obviously there will then be a text in that direction until further notice.
Consequently, the text-this-way bit should be set for that link, regardless of
its state before. If a processor receives a copy of a text over some link, it
is not clear whether this was the only copy of that text in that part of the
reference tree, or whether there are still other copies in that direction. This
information must be contained in the text message. Thus if a processor wishes
to send a text and also keep a copy for itself, it should set the bit in the
text message. If the sender is not keeping a copy, and no other copies exist in
the sender's part of the reference tree (reachable via links other than the one
over which the text message is being sent), then the bit in the message should
be cleared.
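
A minimal sketch of this bookkeeping follows. The function names are
hypothetical. Two points are assumptions rather than statements from the text:
that the message bit is also set when the sender keeps no copy but another copy
is reachable through one of its other links, and that the receiver simply
copies the message bit into its own text-this-way bit for the arrival link.

```python
def send_text(sender_bits, link, keeps_copy):
    """Update the sender's bits and return the bit carried in the text message.

    sender_bits maps neighbor -> text-this-way bit on the sending processor.
    """
    # Is another copy reachable from the sender via some other link?
    copy_elsewhere = any(bit for n, bit in sender_bits.items() if n != link)
    # After sending, a text certainly lies in the direction of `link`.
    sender_bits[link] = True
    return keeps_copy or copy_elsewhere


def receive_text(receiver_bits, link, message_bit):
    """Record whether a text remains in the direction the message came from."""
    receiver_bits[link] = message_bit
```

When the sender gives away its only copy, the message bit is clear and the
receiver learns that no text remains behind the link the message arrived on.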

This level of protocol is sufficient to correctly maintain the text management
bits except in one set of circumstances: when two processors simultaneously
send copies of the same text to each other, and one (or both) of the messages
has its text-this-way bit clear. In this case, the processor(s) receiving the
message(s) with the bit clear would "forget" that a text had been sent to the
other processor(s), and consequently would have an incorrect picture of the
location of texts in the reference tree. This difficulty is avoided by
requiring a processor to be master of a link (with respect to the object whose
text is being sent) before the text of an object is sent over that link. In
practice, this is not a difficult condition to observe, although it does mean
that occasionally the response to an inquiry must be further delayed until the
responder can obtain mastery of the link to be used for sending the requested
text.

3.3.3:  Garbage  Collection 

So far in this section we have been discussing the creation and maintenance
of objects in a distributed system. In any system with limited resources,
another aspect of object management is also important: the identification and
disposal of objects that will never be used again. It is possible to imagine a
system where the programmer is required to explicitly deallocate such objects;
such a discipline is standard in many current programming systems. In
message-passing computation, however, and especially in the style we should
like to encourage as making the best use of a large distributed system, this
can be quite inconvenient. (Consider, for example, being required to notify the
system upon the last use of every actor created during a message-passing
computation.) We are thus motivated to explore schemes for the automatic
garbage collection of inaccessible objects.

To date, there have not been many attempts to solve the problem of garbage
collection in distributed systems. One piece of work that seems related is by
Peter Bishop[4], where he discusses the concept of garbage collection by areas.
Records are kept of references across area boundaries, and these inter-area
links are used to prevent any objects referenced only from other areas from
being collected. Various researchers[12] have noticed that this
garbage-collection scheme seems to have most of the properties that would be
appropriate in a distributed garbage collector, defining each processor as
containing one or more areas. We shall not look at garbage collection from this
viewpoint, preferring to integrate garbage collection into the reference-tree
approach we have been using. However, in the end, it will turn out that our
scheme bears several deep resemblances to Bishop's, which is not surprising
since both must share all functions essential to the garbage-collection
process.

When an object becomes inaccessible, it will in general have become known on
several processors; in other words, its reference tree will have spread across
some part of the system. None of these processors can take the initiative to
delete the object outright because none in general knows whether references to
the object exist on other processors. Therefore, it seems that it might be very
difficult to ever reclaim the object. If the object is only known on one
processor, the story is different. In this case, it is obvious that no
references to the object exist on other processors (else the reference tree
would be larger) and therefore the object can be deleted if it is not
referenced on the one processor where it is known.

Our garbage-collection scheme works by shrinking the reference tree of an
object to be collected until only one processor knows about the object. At that
point, the object can be collected by traditional means. In order for a
reference tree to shrink, nodes must remove themselves from it. The sequences
of messages exchanged in the process of breaking a link have already been
discussed, but the circumstances under which a link may be broken without
harmful effect have not.

Clearly, any node which has more than one neighbor in the reference tree
cannot unilaterally remove itself; if it did, the tree would become
partitioned, since those nodes which were originally connected by the removed
node would now have no means of communicating. Thus only "leaf" nodes (nodes
which have exactly one neighbor also in the reference tree) may disconnect
themselves from it. Fortunately, since reference trees are acyclic, every
reference tree has leaf nodes. Another requirement is that some processor in
the tree have custody of a copy of the text of an object as long as that object
exists. Therefore, if a node attempting to remove itself from an object's
reference tree is the sole custodian of a text for the object (as can be
determined by looking to see if there is another custodian in any direction),
it must first preserve the text by passing it on to its neighbor before
breaking the link. Since both sending a text and breaking a link require a
processor to be master of the link, once the text is sent to the neighbor it
can be considered preserved, provided the node attempting to delete itself does
not relinquish its mastery of the link between sending the text and breaking
the link.

The garbage-collection scheme presented here depends on the fact that a
garbage-collectable object will not be used anywhere once it becomes
garbage-collectable. Thus, after some interval, processors with references to a
garbage-collectable object may guess that it can be collected by the fact that
it has not been used recently on those processors. Even if the object is still
potentially accessible, it is only taking up space if kept on several
processors where it is not being used*. Therefore, it seems economical for a
processor with a reference to such an object to remove itself from the object's
reference tree if it is able (i.e., if it is a leaf node in the tree). By
consistently applying such a strategy, the reference tree of any
garbage-collectable object should slowly shrink to a point (a single node),
whereupon the object can be garbage-collected.
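
The shrinking strategy can be illustrated on a single reference tree. The
sketch below is a simplification under stated assumptions: every processor is
taken to have already decided that the object is unused, link mastery and
message exchange are not modeled, and the function name and data layout are
hypothetical.

```python
def shrink_to_point(tree, text_holders):
    """Repeatedly remove leaf nodes until one processor remains.

    tree maps node -> set of neighboring nodes (connected and acyclic).
    text_holders is the set of nodes that hold a copy of the object's text.
    Returns (last_node, remaining_text_holders).
    """
    tree = {node: set(neighbors) for node, neighbors in tree.items()}
    holders = set(text_holders)
    while len(tree) > 1:
        # Only a leaf (exactly one neighbor in the tree) may remove itself.
        leaf = next(n for n, nbrs in tree.items() if len(nbrs) == 1)
        (neighbor,) = tree[leaf]
        # A sole custodian must pass the text on before breaking the link.
        if holders == {leaf}:
            holders.add(neighbor)
        holders.discard(leaf)
        tree[neighbor].discard(leaf)
        del tree[leaf]
    (last,) = tree
    return last, holders
```

For a chain of three processors with the only text on processor 1, the text is
handed along the chain and ends up on the single surviving processor, which may
then collect the object by ordinary local means.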

There is one unfortunate problem with this pretty picture: the problem involves
collecting objects which are part of cyclic data structures. For example,
consider an object A whose text contains a reference to B, whose text in turn
contains a reference to A. Assume further that the structure is
garbage-collectable, that is, that neither A nor B can be reached from any
ongoing computation. Then by the argument given above, the reference trees for
both A and B should slowly shrink to a point. If both converge to the same
point, there is no problem: ordinary garbage-collection techniques can easily
handle the situation. However, another scenario is possible, as outlined in
Figure 3.10. Here the two objects may spend forever chasing each other's tails,
and it may be that neither reference tree will ever shrink to a point. The
reason this can happen is that when a text is moved from one processor to
another it draws with it the reference trees for all objects referenced in that
text. Thus when the reference tree for object A shrinks and the text of A moves
from processor 1 to processor 2, the reference tree for B will be extended by
the addition of a link from processor 1 to processor 2. If the reference tree
for B attempts to contract next by the removal of processor 3, the text of B
will have to be sent from 3 to 1, re-extending the reference tree of A to
include processor 1.


Solid lines denote links in the reference tree of object A, dashed lines the
tree for object B. A solid box represents a processor with a text for A, a
shaded box a processor with a text for B. Successive reference tree
contractions, alternating between the reference trees of A and B, can lead to
the sequence of situations shown above as (a) through (f), whereupon a final
contraction involving B will restore situation (a).


Figure 3.10: Cyclic restart in garbage collection


Of course, there are many other sequences of events, even starting from one of
the configurations shown in Figure 3.10, which will result in both reference
trees converging on the same point; however, it is possible to have an
infinitely long sequence of events which never results in either object being
collected.

This problem is similar to the problem of cyclic restart in some
transaction-based data base management systems[28]; perhaps there are solutions
to this garbage-collection problem which are analogous to solutions to the
cyclic restart problem. For a practical system, it should be quite safe to rely
on random timing differences due to changing loads to prevent any continuing
pattern such as shown in Figure 3.10 from persisting for long.

3.3.4:  Management  of  Mutable  Objects 

Immutable objects are the most trouble-free objects to deal with in a
distributed system; consequently, much of the foregoing discussion has
implicitly been biased toward the management of immutable objects and toward
taking advantage of the extra flexibility these objects allow. This section
aims to restore the balance by highlighting the ways in which the strategies
that have been presented are relevant to mutable objects, and to comment on
some interesting alternatives in the application of these strategies to mutable
objects.

Mutable objects are objects such as cell, token, and semaphore bodies whose
texts may change over time. The obvious and simplest way to manage one of
these objects is to keep only one copy of its text, rather than allowing
multiple copies. Using this strategy, a processor sending a text for one of
these objects would not retain a copy for itself, as it ordinarily might.



3.3.4.1:  Management  of  Tokens

A strategy for managing tokens which is more sophisticated than simply keeping
all information about a token in one place has already been suggested. Pieces
of the text may be kept in separate places, so long as no piece is ever kept in
more than one place. Thus the complete text of the token may not necessarily
exist at any one location, but exists as the union of all the pieces of text
for that token scattered about the system.

The basic approach is to represent the text of a token as a pair of tables as
shown in Figure 2.7. Actually, only one table need be kept, with two kinds of
entries. Every time the read side of a token receives an object, an RTOK entry
naming the object is added to the table for that token. Whenever the write side
receives an object, a WTOK entry naming the object is added to the table. This
table is the text of an object which we shall call the token body. In fact, as
described in the previous paragraph, the RTOK and WTOK entries go not into one
centralized token table but (at least initially) into the portion of the table
present on the processor where the object was received.

The RTOK and WTOK entries satisfy the record-keeping requirements for
tokens, but it is difficult to use these entries by themselves to generate the
events that should be generated when both the read and write sides of a token
have received messages, perhaps on different processors. Without creating
duplicate RTOK or WTOK entries on other processors that also have pieces of the
text of a token body, those processors must be notified whenever a new RTOK or
WTOK entry is made, so that the proper events can be generated. For this
purpose, two other kinds of table entries are required. Whenever, for example,
an RTOK entry is added to a token body, an entry of type WREQ (request writers)
naming the same object is added also. Similarly, addition of a WTOK entry
prompts the addition of an entry of type RREQ. Unlike RTOK and WTOK entries,
WREQ and RREQ entries are designed to spread to all parts of the text of a
token body. Thus once one of these entries has been added to the token body
text on one processor, that processor will send it to all of its neighbors who
have pieces of text for that token body, they will send it to all their
neighbors who qualify, and so on. Every time a WREQ entry is added to a token
table, an event is generated for each WTOK entry that was already present;
every time an RREQ entry is added, an event is generated to correspond to each
RTOK entry already present.
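
The interplay of the four entry types can be sketched in a few lines. This is
an illustrative model only: the class `TokenNetwork` and its methods are
hypothetical, WREQ and RREQ entries are forwarded instantly by recursion rather
than stored, and receipts are processed one at a time, so neither link mastery
nor simultaneous receipts on different processors are modeled.

```python
class TokenNetwork:
    """Pieces of one token body spread over processors that form a tree."""

    def __init__(self, neighbors):
        self.neighbors = neighbors                # processor -> list of neighbors
        self.tables = {p: [] for p in neighbors}  # permanent RTOK/WTOK entries
        self.events = []                          # (read object, write object)

    def receive(self, processor, side, obj):
        """The read or write side of the token receives obj on a processor."""
        kind, request = ("RTOK", "WREQ") if side == "read" else ("WTOK", "RREQ")
        self.tables[processor].append((kind, obj))
        self._add_request(processor, request, obj, came_from=None)

    def _add_request(self, processor, request, obj, came_from):
        # A WREQ pairs with every WTOK already present, an RREQ with every RTOK.
        partner = "WTOK" if request == "WREQ" else "RTOK"
        for kind, other in self.tables[processor]:
            if kind == partner:
                self.events.append((obj, other) if request == "WREQ" else (other, obj))
        # Spread to all neighbors except the one the entry came from.
        for n in self.neighbors[processor]:
            if n != came_from:
                self._add_request(n, request, obj, came_from=processor)
```

With three processors in a chain, a read-side receipt of X on the middle
processor followed by a write-side receipt of Y on an end processor yields
exactly one event pairing X with Y.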

Since they never interact with any entries added to a table after them, WREQ
and RREQ entries can actually be temporary. The only reason they need to stay
around even temporarily is to give the processor a chance to forward them to
all its neighbors. This, unfortunately, cannot always be done immediately.
Since WREQ and RREQ entries are in reality pieces of token body text, they,
like any other text, may only be sent over a link when the sender is master of
the link. Thus these entries may have to be stored temporarily, and should
include a field indicating which neighbors they have not yet been sent to. Once
a WREQ or RREQ entry has been sent to all neighbors having part of the text for
the token body, however, it can be deleted.

Figure 3.12 illustrates a possible sequence of events for a token known on
three processors P1, P2, and P3, which form a chain in which P2 lies between P1
and P3 (shown in Figure 3.11). The horizontal lines separate snapshots of the
state of the various token tables through time. The column labeled "Event"
shows any events generated from the token at that time. The scenario is that
the read side of the token receives object X on processor P2 at time T1, and
the write side receives object Y on processor P1 at time T6.

      P1 ---- P2 ---- P3

Figure 3.11: Processor configuration for token examples


                        Processor
Time   P1                P2                P3        Event
-----  ----------------  ----------------  --------  -----
T1                       RTOK X, WREQ X
T2                       RTOK X, WREQ X    WREQ X
T3                       RTOK X, WREQ X
T4     WREQ X            RTOK X
T5                       RTOK X
T6     WTOK Y, RREQ Y    RTOK X
T7     WTOK Y            RTOK X, RREQ Y              XY
T8     WTOK Y            RTOK X            RREQ Y
T9     WTOK Y            RTOK X

Figure 3.12: Example of token operation


In more detail, the receipt of object X on processor P2 by the read side of the
token causes the "RTOK X" and "WREQ X" entries to be added to the token table
there, according to the algorithm that has been outlined. At time T2, the WREQ
entry has been sent to P3, but must still be kept at P2 since it has not yet
been possible to send it to P1. At time T3, the WREQ entry has been deleted at
P3 since it has no neighbors to send it to (there is no need to send it to P2
because that is where it came from!). At T4, the WREQ entry has finally been
sent to P1 and thus may be deleted at P2. Skipping to T6, we see that the write
side of the token has received the object Y at P1. By T7 the RREQ entry
generated by this occurrence has moved to P2, where it interacts with the
previously added RTOK entry to cause the event XY. After T7, the RREQ entry
moves on to P3 and finally out of the system, leaving only the "RTOK X" and
"WTOK Y" entries as permanent records of this activity. These will come into
action in the future if additional objects are received by either the read or
write side of the token.

As it has been described up to now, the implementation has a bug, depicted in
Figure 3.13.


                        Processor
Time   P1                P2                P3                Event
-----  ----------------  ----------------  ----------------  -----
T1     RTOK X, WREQ X                      WTOK Y, RREQ Y
T2     RTOK X            WREQ X, RREQ Y    WTOK Y
T3     RTOK X, RREQ Y    WREQ X            WTOK Y            XY
T4     RTOK X            WREQ X            WTOK Y
T5     RTOK X                              WTOK Y, WREQ X    XY
T6     RTOK X                              WTOK Y

Figure 3.13: Incorrect token operation


This situation can result if a token receives an object at its read side on one
processor and an object at its write side on another processor simultaneously,
or at least close enough in time that the RREQ or WREQ messages generated by
the first event have not yet had time to propagate to all processors having
pieces of text for that token. For example, in Figure 3.13, X is received at
processor P1 by the read side at the same time as Y is received at P3 by the
write side. The resultant machinations cause the event XY to be generated
twice, when in fact it should be generated only once. Devising a solution to
this problem requires taking a closer look at exactly what functions the
various parts of our implementation are supposed to serve.

The RTOK and WTOK entries are intended to fulfill the record-keeping
requirements of our system. They correspond to the table entries in Figure 2.7,
summarizing the history of the token so that it will be known what events to
generate in the future. WREQ and RREQ are messengers, signs of activity in the
token table; specifically, indicators that a new RTOK or WTOK entry has been
added. The sole purpose of the WREQ and RREQ entries is to interact with all
previously added WTOK or RTOK entries to generate the appropriate events. A
WREQ or RREQ entry, however, should not interact with a subsequently added WTOK
or RTOK, because the RREQ or WREQ resulting from the addition of the latter
will interact with the RTOK or WTOK corresponding to the former. Thus we see
that the success of our scheme hinges on the specification of a strict ordering
on all RTOK and WTOK entries so that it can be determined which were added
prior to any given RREQ or WREQ entry and which may have been added
subsequently.

The solution adopted in this case was to time-stamp all token table entries
using a scheme suggested by Lamport[18]. It is not necessary to use an external
source of time for this; any scheme for time-stamping a token will be satisfactory
if it yields time stamps which show no inconsistency (failure to increase
monotonically) observable in a context where the messages relating to the token
body are the only messages traveling between processors. Our scheme involves
keeping at the head of each token table on each processor a current value of its
time stamp. Note that, even for the same token, the time stamp values need not be
the same on all processors at all times. Every time a new RTOK or WTOK entry is
added to a table, that table’s time stamp is first incremented and the new value
becomes the time stamp both of the RTOK or WTOK entry and of the corresponding
WREQ or RREQ entry. Every time a table entry is sent, the time stamp of the entry
is sent with it, and the time stamp of the table it was sent from is also sent.
Every time a table entry is received, the time stamp of the table on the receiving
processor is set to the maximum of its old time stamp and the table time stamp
from the message. The time stamp of the new entry in the table is just the time
stamp of the entry in the message. One unfortunate problem with time stamps is
their propensity to overflow after a while. As with eventcounts[24], a viable,
though not elegant, solution is just to allocate enough bits for storage of time
stamps that a system could operate for a long time with no overflow.

Now  a WREQ  or  RREQ  entry  being  added  to  a table  will  only  interact  with  WTOK 
or  RTOK  entries  whose  time  stamps  are  earlier  than  that  of  the  WREQ  or  RREQ 
entry.  Ties  are  prevented  by  using  the  unique  ID  of  the  processor  where  an  entry 
was  added  as  the  least  significant  bits  of  the  entry’s  time  stamp. 
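
The time-stamping rules can be illustrated with a minimal sketch. It is not the code used in this research: the class name TokenTable, its methods, and the message layout are all assumptions made for illustration, and the replay forwards each request unchanged through P2 rather than re-sending it with P2's own table time stamp. Events are represented as (read object, write object) pairs.

```python
from dataclasses import dataclass, field


@dataclass
class TokenTable:
    """One processor's table for a single token (all names are illustrative)."""
    proc_id: int
    clock: int = 1                                # table time stamp
    rtok: list = field(default_factory=list)      # (stamp, object) read history
    wtok: list = field(default_factory=list)      # (stamp, object) write history

    def add(self, side, obj):
        """Record an object arriving locally at the 'read' or 'write' side.

        Returns the request message to broadcast plus any events generated
        against entries already in this table.
        """
        self.clock += 1                           # increment before stamping
        stamp = (self.clock, self.proc_id)        # processor ID breaks ties
        if side == "read":
            self.rtok.append((stamp, obj))
            kind = "WREQ"
        else:
            self.wtok.append((stamp, obj))
            kind = "RREQ"
        return (kind, obj, stamp, self.clock), self._interact(kind, obj, stamp)

    def receive(self, message):
        """Process a WREQ or RREQ arriving from another processor."""
        kind, obj, stamp, sender_clock = message
        self.clock = max(self.clock, sender_clock)
        return self._interact(kind, obj, stamp)

    def _interact(self, kind, obj, stamp):
        # A request pairs only with entries stamped strictly earlier than it.
        if kind == "WREQ":                        # obj arrived at a read side
            return [(obj, w) for s, w in self.wtok if s < stamp]
        return [(r, obj) for s, r in self.rtok if s < stamp]


# Replay of the race: X reaches the read side at P1 while Y reaches the
# write side at P3, before either request has propagated.
p1, p2, p3 = TokenTable(1), TokenTable(2), TokenTable(3)
wreq, _ = p1.add("read", "X")
rreq, _ = p3.add("write", "Y")
events = []
for table in (p2, p1):                            # RREQ Y travels P3 -> P2 -> P1
    events += table.receive(rreq)
for table in (p2, p3):                            # WREQ X travels P1 -> P2 -> P3
    events += table.receive(wreq)
```

Comparing the pairs (time, processor ID) lexicographically has the same effect as placing the processor ID in the least significant bits of the time stamp. In the replay, only the RREQ arriving at P1 finds an earlier entry, so the event is generated once.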

Figure 3.14 is a replay of the scenario in Figure 3.13 showing how the addition
of time stamps solves the problem demonstrated there. The time stamp of a table
is shown in square brackets to the left of the table; the time stamp of each entry
is shown following it in the form "(timestamp, processor ID)." Although some
intermediate steps have been shown, the labels T1 through T6 have been chosen to
label the situations corresponding to similarly labeled situations in Figure 3.13.
Other than the machinations of time stamps at work, the only difference between
the two scenarios is that no event is generated at T6 when time stamps are used.
This is because the time stamp of (2,P3) on the WTOK entry is later than the time
stamp of (2,P1) on the WREQ entry at processor P3 (this assumes a collating
sequence in which P3 > P1; the reverse assumption is also tenable and would
result in an event at T6 but none at T3). Due to space limitations, this example
does not show any very exotic instance of time stamps at work; the reader is
invited to construct his own scenarios and see how time stamps would handle them.

                                  Processor
Time  P1                    P2                    P3                    Event

      [1]                   [1]                   [1]

      [2] RTOK X(2,P1)      [1]                   [1]
          WREQ X(2,P1)

T1    [2] RTOK X(2,P1)      [1]                   [2] WTOK Y(2,P3)
          WREQ X(2,P1)                                RREQ Y(2,P3)

      [2] RTOK X(2,P1)      [2] WREQ X(2,P1)      [2] WTOK Y(2,P3)
                                                      RREQ Y(2,P3)

T2    [2] RTOK X(2,P1)      [2] WREQ X(2,P1)      [2] WTOK Y(2,P3)
                                RREQ Y(2,P3)

T3    [2] RTOK X(2,P1)      [2] WREQ X(2,P1)      [2] WTOK Y(2,P3)      XY
          RREQ Y(2,P3)

T4    [2] RTOK X(2,P1)      [2] WREQ X(2,P1)      [2] WTOK Y(2,P3)

T5    [2] RTOK X(2,P1)      [2]                   [2] WTOK Y(2,P3)
                                                      WREQ X(2,P1)

T6    [2] RTOK X(2,P1)      [2]                   [2] WTOK Y(2,P3)

Figure 3.14: Corrected token operation

This completes our discussion of the minimum required to make tokens work,
but there are still improvements that will make the use of tokens more feasible. It
has probably already occurred to the reader that the table analogy has some
liabilities as a basis for an implementation. There are cases where it is neither
necessary nor desirable for a token to remember every object ever sent to it. For
example, when a token is used to implement recursion, as in the operator of Chapter
2, the write side of the token will receive a message once and never be used
again, while the read side will receive another message every time the recursive
function is called. If tokens are implemented using tables, the tables will grow and
grow as long as the token is in use. This is clearly undesirable, since alternative
strategies for recursion manage to get along without tying up unbounded amounts
of storage in this way. It is also unnecessary; the history of objects sent to the
read side will never be used because they could only come into play if a message
was received by the write side, but this will never happen. Thus the only record
that must be kept in this case is of the object that was sent to the write side,
which will interact with each new object sent to the read side.

The fact that the read table need no longer be kept, and that any current
RTOK entries can actually be deleted, can even be discovered by the system in
most cases. In the example of the operator, the reason we are sure that the
write side of the token will never receive another message is that it becomes
inaccessible shortly after receiving its first message. If this is true, the object
representing the write side of the token (recall that this is distinct from the read
side and also from the token body) will eventually be garbage-collected. When the
write side is deleted on the last processor having any references to it, the
garbage collector (which can easily detect this event) can send a broadcast message
notifying all processors having texts for the token body. This can be done using
the same mechanism by which RREQ and WREQ entries are "broadcast" to all such
processors. Receipt of this notification will cause the deletion of all RTOK entries
and a state change preventing new RTOK entries from being added in the future.
This deletion is always safe, provided that the garbage collector notification is not
allowed to "pass through" RREQ and WREQ entries awaiting transmission. Of
course, all the mechanism described in this paragraph applies symmetrically to the
case where the last reference to a read side is deleted and the write side is still
active.
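
The ordering requirement on the garbage collector notification can be sketched as follows. This is an illustration under stated assumptions, not the implemented protocol: the class TokenText, the message kind "WRITE_SIDE_GONE", and the use of a single first-in, first-out queue to stand for a link are all hypothetical, and time stamps are omitted for brevity.

```python
from collections import deque


class TokenText:
    """One processor's text of a token body (illustrative names only)."""

    def __init__(self):
        self.rtok = []             # objects remembered from the read side
        self.wtok = []             # objects remembered from the write side
        self.keep_rtok = True      # cleared once the write side is collected

    def add_read(self, obj):
        # After the notification, read-side history is no longer recorded.
        if self.keep_rtok:
            self.rtok.append(obj)

    def handle(self, message):
        """Apply one incoming message and return the events it generates."""
        kind, obj = message
        if kind == "RREQ":                     # an object reached the write side
            return [(r, obj) for r in self.rtok]
        if kind == "WREQ":                     # an object reached the read side
            return [(obj, w) for w in self.wtok]
        if kind == "WRITE_SIDE_GONE":          # garbage collector notification
            self.rtok.clear()
            self.keep_rtok = False
        return []


def drain(link, text):
    """Deliver queued messages strictly in the order they were sent."""
    events = []
    while link:
        events += text.handle(link.popleft())
    return events


text = TokenText()
text.add_read("X")
link = deque([("RREQ", "Y"), ("WRITE_SIDE_GONE", None)])   # FIFO link
events = drain(link, text)
text.add_read("Z")                 # ignored: the read table is closed
```

Because the notification is queued behind the RREQ already awaiting transmission, the RREQ still finds the RTOK entry for X before that entry is deleted. Had the notification overtaken it, the event would have been lost.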

Although this implementation of tokens has not been proven correct, it has
been coded and exercised in a situation which seems, from other experience, to
have been able to produce almost any pathological sequence of events imaginable.
Thus there is reason to be confident that any bugs in the scheme described in this
section are the result of flaws in the author’s descriptive talents rather than signs
of any underlying weakness in the scheme.

3.3.4.2:  Management  of  Cells 

Along with tokens, cells can be managed more imaginatively than just by
allowing a maximum of one copy of the text. Consider, for example, a cell which
contained a number indicating the current year. Such a cell could be thought of as an
immutable object for long periods of time, with occasional lapses of mutability
around New Year's Day. Similar comments could be made regarding much of the
data stored in conventional file systems, such as commands and subroutines
provided by the system staff (at least once they seem to work!).

It would be pleasing to treat these as immutable objects during the periods
when they are not changing, yet retain the capability to change them if
appropriate. Happily, our system is fairly well set up to do just this. It is certainly
simple to treat a cell body as if it were immutable, allowing any number of copies of
it to be made. The problems start when the cell must be updated. Fortunately, the
reference tree for the cell body links together all processors having any direct
knowledge of the cell, so all processors in possession of a copy of the cell body’s
text can be located. Consequently, a processor wishing to perform an update on a
cell can broadcast a message to all other processors having copies of the old cell
text, inducing them to delete their copies. Meanwhile, of course, the updating
processor should not give out any new copies. When acknowledgment is received
that all other copies of the cell body text are gone, the update may be performed
on what is at that moment the only copy of the cell body text in the system.

Complications arise, however, if two processors more or less simultaneously
decide to perform different updates to the same cell. Presumably all the other
processors will delete their copies of the cell body text when they are requested to,
but each of the two updating processors will hold fast, waiting for the other to
delete its copy of the text: a deadlock. This problem, like the problem of too many
events being generated from tokens, can be solved by placing any strict ordering
on the update requests. One possibility would be simply to use the processor ID of
the processor making the request, but this would probably result in some
processors operating at permanent and unfair advantages or disadvantages. A more
palatable solution is to have time stamps on cell bodies like the time stamps on
token bodies and order updates by these time stamps. Each update message
broadcast to all processors to induce them to delete their cell body texts will carry
the time stamp of the update. If an update message is received by a processor
which is itself attempting an update, the time stamp of the message is compared to
that of this processor’s update. If less, this processor’s update attempt is
aborted, to be tried again later. If greater, the message is ignored. This rule
ensures that the update with the oldest time stamp will have precedence.
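
The comparison rule can be written down as a small sketch. The function name and the returned strings are hypothetical, and the sketch assumes that cell time stamps, like token time stamps, carry the processor ID as a tie-breaker so that no two updates compare equal.

```python
def on_update_message(own_stamp, incoming_stamp):
    """Decide how a processor reacts to another processor's update message.

    own_stamp is None when this processor has no update pending.
    Stamps are (time, processor_id) pairs, so no two are ever equal.
    """
    if own_stamp is None:
        return "delete copy"            # ordinary holder: give up the text
    if incoming_stamp < own_stamp:
        return "abort own update"       # the older update has precedence
    return "ignore"                     # ours is older; the sender will abort


# Two processors race to update the same cell.
stamp_p1, stamp_p2 = (7, 1), (7, 2)
at_p1 = on_update_message(stamp_p1, stamp_p2)
at_p2 = on_update_message(stamp_p2, stamp_p1)
```

Exactly one of the two racing processors aborts, so the deadlock cannot persist. What an aborting processor does with its own copy of the text is not modeled here.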

The beauty of this scheme for handling cells is that, except for the additional
overhead of maintaining a time stamp, it includes as a special case the obvious
way of managing cells by only keeping a single copy of the text. It is trivial to tell
(by consulting the processor’s reference tree status bits for the cell body)
whether any other processors have copies of a cell body text. If not, the update
can be performed locally just as in the single-text algorithm. Only if other copies
exist must other processors be informed. In either case, immediately after an
update is performed there is again only one copy of the text, which will only
spread to other processors if it is still in demand.


3.3.4.3: Summary

As this section has been seeking to demonstrate, it is possible by the exercise
of some amount of ingenuity to come up with alternative schemes for managing
mutable objects which preserve more of the distributed flavor of our system. The
bookkeeping associated with the reference tree mechanism was seen to be a
material help in this process, facilitating the job of tracking down all other
processors with knowledge of a mutable object when a change must be made to that
object.

Unfortunately, this section also bears witness to the sensitivity of the design
of these alternative schemes to differences in the attributes of the mutable
objects being implemented. Not only does the implementation of tokens differ
considerably from that of cells, but a section on how to implement semaphores would
contain yet another ad hoc scheme (no such section has been included because it
would yield no new insight into the capabilities of the reference tree scheme).

On the bright side, the implementations described for tokens and cells seem
quite viable. Both reduce to their simplest cases when only one processor knows
about the object (which is likely to be the case for the majority of objects),
seeming not to carry an excessive penalty for their generality when that generality is
not used. For the most part, the implementation of tokens avoids retaining data in
token tables that will never be used, and thus makes it reasonable to actually use
tokens for all the purposes discussed in Chapter 2. The suggested implementation
of cells may make it much less painful to retain an option to change an object,
even if that option is not expected to be exercised often. It also raises other
possibilities, for example, objects analogous to files on current timesharing systems
(objects of type, say, mutable character string). In these objects, it would be
possible to reach inside and change part of the text without altering the rest. If this
semantics were deemed useful, there is no reason why such objects could not be
implemented in much the same manner as cells.

3.3.5: Alternative Reference Tree Algorithms

The purpose of this section is to deal with a couple of questions that were
postponed from the middle of the foregoing presentation of reference trees. The
first topic we discuss is removing the requirement that all reference trees remain
connected.

3.3.5.1: Disconnecting Reference Trees

An objection to the current reference tree mechanism is that if two distant
processors need to know about an object, a whole chain of intermediate
processors is forced to know about it also, due to the requirement that reference trees
must remain connected. Much of the overhead imposed on these processors might
be eliminated if this connection could be broken. It is not difficult to devise a
protocol by which one of the intermediate processors could cause this to happen.
Since all communication involving an object travels only along links in its reference
tree, however, two requirements must be met: a custodian of a copy of the text
of the object must exist on either side of the break (otherwise the processors in
one of the disconnected pieces would have no access to the text of the object),
and the object must be immutable (otherwise an update performed in one half of
the tree would never become visible in the other half). Effectively, breaking a link
in the reference tree for an object creates two reference trees for the object.
Each of the new trees will then behave as if it were the only reference tree for
that object. Specifically, leaf nodes of either tree may then delete themselves
from it, so all the intermediate processors in our example can leave the tree, one
by one, resulting in the desired situation where only the two distant processors
know about the object.
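
The two requirements for breaking a link can be checked mechanically, as in the following sketch. It takes a global view of the tree purely for illustration, whereas a real protocol would have to work from each processor's local knowledge. The function names and the representation of the tree as a list of processor pairs are assumptions.

```python
def reachable(edges, start):
    """Return the set of processors connected to start by tree links."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for a, b in edges:
            for here, there in ((a, b), (b, a)):
                if here == node and there not in seen:
                    seen.add(there)
                    stack.append(there)
    return seen


def may_break(edges, link, text_holders, immutable):
    """Check the two requirements for breaking one reference tree link."""
    if not immutable:
        return False                     # updates could not cross the break
    remaining = [e for e in edges if set(e) != set(link)]
    side_a = reachable(remaining, link[0])
    side_b = reachable(remaining, link[1])
    # Each piece needs its own custodian of a copy of the text.
    return bool(side_a & text_holders) and bool(side_b & text_holders)


chain = [(1, 2), (2, 3), (3, 4), (4, 5)]   # P1 and P5 are the distant pair
ok = may_break(chain, (2, 3), {1, 5}, immutable=True)
no_text = may_break(chain, (2, 3), {1}, immutable=True)
mutable = may_break(chain, (2, 3), {1, 5}, immutable=False)
```

In the chain above, the break is allowed only when both end processors hold a text and the object is immutable. After it, processors 2, 3 and 4 are leaves of their respective pieces and may withdraw one by one.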

In fact, the only problem with disconnecting a reference tree arises if it is
ever desired to re-connect the tree. The reference tree management protocol
avoids cycles in reference trees by refusing to make a connection if two branches
of reference tree for the same object bump into each other. This is done because
it is assumed that all branches of reference tree for the same object are already
connected; therefore, adding another connection would close a cycle. If, as the
result of a disconnection, the two branches are not connected, the protocol will
still refuse to connect them. It would be quite difficult to allow such branches to
be re-connected without introducing the possibility that cycles could be formed.
Thus it seems that once a reference tree is broken into two or more pieces, those
pieces must continue to exist independently for as long as they continue to exist.
This is not necessarily bad, however. Each piece is still free to grow, shrink, and
move just as the original was, and thus each may independently be reclaimed by
the garbage collection mechanism when its usefulness is ended.

3.3.5.2: Reorganizing Reference Trees

Figure 3.4 gave an example of a non-optimal reference tree, and the
accompanying text suggested that there might be ways of improving such trees. Such
strategies were not investigated in the course of this research, but a couple of
approaches to the problem have suggested themselves. Any approach based on
purely local knowledge of the reference tree should fit easily into our scheme,
provided it preserves the essential properties of reference trees (connectedness and
freedom from cycles).


One possibility is for a leaf node which is aware that one of its neighbors is
connected to the tree by a different path (perhaps because of receiving an R+
message from that neighbor) to break its old connection and connect instead to
that neighbor. This kind of operation is depicted in Figure 3.15.


Figure 3.15: A simple reference tree reorganization


It is difficult, unfortunately, for a non-leaf node to make this kind of jump, because
it will not know which of its old links to break. If the wrong choice is made, not
only will the reference tree become disconnected, but one half of it will contain a
cycle, as shown in Figure 3.16.
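
The difference between the leaf and non-leaf cases can be seen in a small sketch that tests the two essential properties directly. The five-processor chain and the helper names are hypothetical, and the check again uses a global view that no single processor would have.

```python
def is_tree(nodes, edges):
    """True when the links connect every node and contain no cycle."""
    if len(edges) != len(nodes) - 1:
        return False
    neighbors = {n: set() for n in nodes}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for nxt in neighbors[stack.pop()] - seen:
            seen.add(nxt)
            stack.append(nxt)
    return seen == set(nodes)


def rewire(edges, broken, added):
    """Return the link set after breaking one link and adding another."""
    return [e for e in edges if set(e) != set(broken)] + [added]


nodes = ["A", "B", "C", "D", "E"]
tree = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]

# Leaf E has only one link to break, so jumping to neighbor A is safe.
leaf_move = is_tree(nodes, rewire(tree, ("D", "E"), ("E", "A")))

# Non-leaf C links to E and must guess which old link to break.
good_guess = is_tree(nodes, rewire(tree, ("C", "D"), ("C", "E")))
bad_guess = is_tree(nodes, rewire(tree, ("B", "C"), ("C", "E")))
```

The bad guess leaves A and B cut off while C, D and E form a cycle, which is the situation the text warns against.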

In addition to the mechanics of reorganizing the tree, there is of course a
strategy question: when is this wise? Once again, in simple cases the answer can be
fairly obvious, but in general it may not be. If the goal for the processor changing
its links is to get closer to a copy of the text of the object, then it is obviously a
good idea to change if the processor being connected to has a copy of the text.
If it does not have a copy of the text, then it will either have to have some idea
how far the nearest text is (a piece of information that might become obsolete
every time a text moved) or other considerations will have to be invoked.


Figure 3.16: A dangerous reference tree reorganization


3.3.6: Summary

This section of the thesis dealt in depth with the question of object
management in our distributed system. Reference trees were introduced as a means for
keeping track of objects and the processors that know about them. A detailed
section described a set of protocols for managing these reference trees in a
distributed fashion; both management of membership in a tree and management of
object texts were discussed. Special attention was given to the problem of
managing texts of mutable objects, holding out hope that such objects can be
effectively supported on our distributed system. Reference trees lead naturally to
a certain approach to garbage collection on distributed systems, which was
described. Finally, the feasibility of improving reference trees by allowing various
reorganizations was briefly explored.

Any problem whose solution is as intricate as that presented here obviously will
yield to many other solutions as well. The purpose of this section was to give a
concrete lower bound on the effectiveness with which all the problems surrounding
object management can be solved. By describing an approach that has actually
been implemented, it invites others to duplicate or improve on it. At the same time,
it should be recognized that object management is not the only problem facing a
distributed system. The object management algorithms presented can assure that
the system will operate correctly, but they cannot assure that it will operate
efficiently.

3.4: Event Distribution Strategy

It should be clear by now that the algorithm for assigning events to processors
is a critical factor in determining the success or failure of the system. In fact, it is
undoubtedly the single most important influence on the performance of the system.
The object management algorithms described in this chapter are significant in that
they assure the correct operation of the system. No matter how good they are,
however, the system will not perform satisfactorily unless events are distributed
intelligently. Along with being a critical component of the system, unfortunately,
event distribution strategy is also a difficult topic to approach rigorously. Thus it
seems likely that the measure of an event distribution strategy will only be found
by experiment. This section describes the extremely simple strategy that was
used in the course of this research, and gives what limited conclusions can be
drawn as to its usefulness. Also presented are several ideas (as yet untested)
which may help in the effort to construct more effective event distribution
strategies.


3.4.1: The QFUDGE Strategy

The simple strategy which was actually used was just this: if the number of
events on a processor’s event list exceeds the number of events on some
neighbor's event list by a certain constant (dubbed QFUDGE) or more, then the
processor will send a new event to that neighbor rather than appending it to its own
event list. In order to keep current about the number of events on each other’s
event lists, processors include that information about themselves in all messages
sent to other processors. If a processor has not sent any messages over a
particular link for a while, it will send a null message in that direction just to keep the
processor at the other end informed of the state of its business.
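
The placement rule is simple enough to state in a few lines. The sketch below is illustrative only: the function name, the dictionary of last-reported neighbor counts, and the default value of 3 are assumptions, as is the choice of the least loaded neighbor when more than one qualifies (the text does not say which is chosen).

```python
QFUDGE = 3   # hypothetical setting; the simulations varied it from 1 to 5


def place_event(own_count, neighbor_counts, qfudge=QFUDGE):
    """Pick a destination for a new event.

    neighbor_counts maps each neighbor to the event-list length it last
    reported. Returns a neighbor name, or None to keep the event locally.
    """
    if not neighbor_counts:
        return None
    # Assumption: when several neighbors qualify, prefer the least loaded.
    target = min(neighbor_counts, key=neighbor_counts.get)
    if own_count - neighbor_counts[target] >= qfudge:
        return target
    return None


sent = place_event(5, {"left": 4, "right": 2})     # 5 - 2 >= 3
kept = place_event(4, {"left": 4, "right": 2})     # 4 - 2 < 3
```

The counts consulted here are whatever each neighbor last reported, so they may be stale by the time a decision is made. The null messages described above bound how stale they can become.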

The principal motivation behind adopting the QFUDGE strategy was that it was
simple and yet produced passable performance, and thus could be used while the
other aspects of the implementation were worked out and debugged. To give a
rough idea of the effectiveness of this strategy, we present some simulation
results. The simulation was of a ring of eight processors, with two different
choices for the bandwidth of the interprocessor links. One used relatively
high-bandwidth channels, on the order perhaps of one-megabaud links between
microprocessors. The other used channels fifty times slower, or approximately twenty
kilobaud compared to typical microprocessor computing speeds. The parameter that
was actually varied was a ratio between computing speed and channel speed, so
the estimates given above must be considered extremely coarse and highly
dependent on the computing speed of the interpreters implemented on the processors.

The system was given twenty independent computations of the value of a
function (always applied to the same argument) and the simulated elapsed time
was measured from the point where these twenty tasks were given to the system
(all at the same processor) to the point where the last of the twenty answers was
typed. The test was then repeated with twenty more parallel computations of the
same function value. The purpose of this repetition was to gauge the effect of the
fact that, after the first test, several object texts useful for the second test
(such as the text of the function) would have become available on several of the
processors, decreasing the time required for an event being moved to acquire a
working set on its new processor. The combined times for the original test and
repetition also give some indication of the probable performance of the system had
the original work load been twice as severe.

Table  3.17  gives  separately  the  times  for  the  first  and  second  tests,  using 
both  fast  and  slow  communication  channels,  for  QFUDGE  values  of  one  through  five. 
The  times  given  have  been  normalized  so  that  unity  represents  the  time  required  to 
do  all  calculations  on  a single  processor. 


                           QFUDGE Value               Optimum
Channel  Test      1      2      3      4      5      QFUDGE

slow     first    1.66   1.77   1.66   1.22   0.09     >5
slow     second   0.78   0.61   1.61   0.61   0.70     3-4
fast     first    0.68   0.64   0.64   0.63   0.61     3-4
fast     second   0.32   0.31   0.41   0.46   0.63     2

Table 3.17: Execution time using the QFUDGE strategy


The most egregious anomaly in these figures is the entry 1.61 for the execution
time of the second test using the slow channel with a QFUDGE value of 3. This
particular entry corresponds to a peculiar run in which nineteen of the twenty
answers were typed out in a normalized execution time of 0.47. Computation of
the remaining result apparently required amazing amounts of interprocessor
communication to complete. Using a figure closer to 0.47 for that value makes the table
look nicer, but it is still dangerous to try drawing too many inferences from the
information. Some intuitively appealing relationships do seem to be apparent,
however. The execution times in each row of Table 3.17 except the first seem to
exhibit a minimum, indicating an optimal value of QFUDGE for that application. (We
may infer that the first row of figures also has a minimum, to the right of the table
at some QFUDGE value greater than 5.) These optimal values are shown to the
right of the table.

We may regard too low a value of QFUDGE as encouraging excessive
communication overhead, and too high a value as allowing insufficient access to the
computing power in other parts of the system. It is thus pleasing (though not surprising)
that faster communications seem to favor lower values of QFUDGE, and that the
second trials, where less overhead is required to set up working sets, do also.

The figures in Table 3.17 also illustrate both the potential of multiprocessor
systems such as the one simulated, and the amount of room that is left for
improvement. Under appropriate circumstances, it was indeed possible to cut total elapsed
computation time by a factor of three. On the other hand, eight processors were
employed to accomplish this (thus an optimal arrangement of events might be
expected to cut computation time by a factor of eight). It must also be noted that
the kind of trials conducted were of the most favorable possible sort to the
system. Much work on scheduling remains to be done before a system such as this
can operate effectively on a wide range of problems.
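
The "factor of three" and "factor of eight" figures can be related by a short calculation. The sketch below assumes the normalized time of 0.31 reported for the fast channel, second test, at a QFUDGE value of 2; the function names are illustrative.

```python
def speedup(normalized_time):
    """Speedup over one processor, given a time normalized to that case."""
    return 1.0 / normalized_time


def efficiency(normalized_time, processors):
    """Fraction of the ideal speedup actually achieved."""
    return speedup(normalized_time) / processors


best = 0.31                                  # fast channel, second test, QFUDGE 2
gain = round(speedup(best), 2)
share = round(efficiency(best, 8), 2)
```

A normalized time of 0.31 corresponds to a speedup of a little over three, which on eight processors is roughly two fifths of the ideal.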




3.4.2: Improved Event Distribution Strategies


We can imagine work on improving the performance of a mu calculus system
being directed along two somewhat interrelated dimensions: improvements to the
interpreter algorithm and improvements to the interpreter strategy. The former
includes such possibilities as sending several related texts in response to a single
inquiry and has already been alluded to. The purpose of this section is to
concentrate on the latter.

The magnitude of the job facing an event distribution strategy can be
appreciated by thinking of a system computing a weather forecast, or perhaps a
solution to Laplace’s equation. Such computations are characteristically organized
as a grid of logical "cells," each with a certain amount of state information
(e.g., temperature, pressure, or wind direction), and each interacting only with
its neighbors following a simple set of rules (e.g., difference equations derived
from the laws of physics). Stepping such a grid through its paces causes the state
variables to converge on the desired solution (e.g., tomorrow’s weather forecast).
A program of this sort could be written in the mu calculus by creating an object
to model each grid point, and arranging the behavior of these objects so that at
each iteration each object gave rise to an event sending to each of its neighbor
objects the value of each of its relevant state variables. This kind of problem
seems like an ideal one for our network to solve, provided that events and objects
are allocated to processors in a reasonable manner. A reasonable manner would
probably be a distribution which related the logical structure of the problem to
the physical structure of the system, for example, mapping several grid points
onto each processor in such a way that neighboring grid points were assigned
either to the same processor or to neighboring processors.
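As a rough illustration of the kind of mapping just described, the following
sketch assigns a square block of grid points to each processor and checks that
neighboring grid points always land on the same processor or on neighboring
processors. The names `owner`, `processors_adjacent`, and `mapping_is_local`
are hypothetical, and the assumption that the processors themselves form a
rectangular array with four-neighbor links is made only for this example.

```python
def owner(row, col, block):
    """Return the (assumed) processor coordinates that hold grid point (row, col).

    Each processor holds a block x block tile of the logical grid.
    """
    return (row // block, col // block)


def processors_adjacent(p, q):
    """True if p and q are the same processor or four-neighbors in the array."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1]) <= 1


def mapping_is_local(rows, cols, block):
    """Check that every pair of neighboring grid points is assigned either to
    the same processor or to neighboring processors."""
    for r in range(rows):
        for c in range(cols):
            # Checking the right and down neighbors covers every adjacent pair.
            for dr, dc in ((0, 1), (1, 0)):
                nr, nc = r + dr, c + dc
                if nr < rows and nc < cols:
                    if not processors_adjacent(owner(r, c, block),
                                               owner(nr, nc, block)):
                        return False
    return True
```

A tiled assignment of this kind keeps most neighbor-to-neighbor events on one
processor and sends the remainder over a single link.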

It is fairly likely that the simple GRUDGE strategy would be a miserable failure
in this respect. Assuming that all events and objects started out on one
processor, they would be randomly farmed out to its neighbors, and their
neighbors, and so on, with almost no regard to their logical relationships. As a
result, communication paths would probably become many times longer than
necessary. Perhaps the most practical solution to this particular problem would be
to specify a language in which the programmer could indicate the optimum
distribution. We do not pursue this approach for two reasons: it requires the
programmer to know about the physical structure of the system, an attribute we
have been trying to hide from him, and it may not help very much in most cases,
where the logical structure of the problem is probably more fluid and less known.
Another possibility might be some kind of centralized scheduling processor with an
overview of all system operations. We do not pursue this either, since it is at
variance with the basic system philosophy that all responsibility should be
distributed.

It would be impressive to come up with an automatic, distributed scheduling
algorithm that caused the event and object distribution for, say, a weather
forecasting program, to converge toward the optimum. A more modest objective,
however, would just be a scheduling strategy that avoided doing obviously stupid
things, like keeping two related objects far apart in the system while messages
were constantly being sent back and forth. To this end, we might consider some
kind of "rubber band" strategy, wherein objects referenced in an event, especially
the receiver object of an event, could exert a certain amount of "pull" on the
event, making it easier to ship it to a neighbor where the object was known, and
harder to send it where the object was not known. The pull might be especially
strong if that event was the only reason for the object’s being known on the
current processor.
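One way such a "pull" might be quantified is sketched below. This is only an
illustration: the `Event` record, the function `pull_toward`, and all of the
numeric weights are assumptions introduced here, and no such scoring rule was
part of the simulated system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    receiver: str                        # the receiver object of the event
    references: frozenset = frozenset()  # other objects referenced in the event


def pull_toward(event, known_at_neighbor, held_only_for_event=frozenset(),
                receiver_weight=2.0, reference_weight=1.0,
                sole_reason_bonus=1.0):
    """Score how strongly an event is pulled toward one neighbor.

    A positive score makes shipping the event to that neighbor easier; a
    negative score makes it harder. All weights are illustrative assumptions.
    """
    weights = {obj: reference_weight for obj in event.references}
    weights[event.receiver] = receiver_weight  # the receiver pulls hardest
    score = 0.0
    for obj, weight in weights.items():
        if obj in known_at_neighbor:
            score += weight
            # Extra pull if this event is the only reason the object is
            # known on the current processor.
            if obj in held_only_for_event:
                score += sole_reason_bonus
        else:
            score -= weight
    return score
```

A distribution strategy could compare this score across its neighbors and
prefer the neighbor with the largest value, or keep the event where it is if
every score is negative.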

Another possibility might be for the event distribution strategy to take some
account of the difficulty of accumulating a working set. Perhaps an event
descended from a long line which had been proceeding smoothly, with no pauses to
fetch data from another processor, would be kept at the same location in
preference to a newly received event, or an event whose working set appeared not
to be present.

Finally, event distribution strategy should be concerned with optimizing more
than just CPU usage. Memory usage is another important parameter. The
distribution of objects in the system is governed to a considerable extent by the
distribution of events. Thus an event distribution strategy might try to gauge the
quantity of objects required to be near each event (this is probably the same as
the working set of the event) and try to collect an appropriate mix on each
processor, depending on each processor’s memory size and computing speed.

3.4.3:  Storage  Management  Strategy 

The job of the garbage collector is the identification and removal of objects
that are no longer accessible in the system. Storage management strategy goes
beyond this to include also the handling of objects which, although still
accessible, are not likely to be needed at a particular processor. A useful view
of the memory of any individual processor in the system is as a cache containing,
ideally, the data upon which the processor is most probably about to operate. If a
processor requires access to information (i.e., an object text) not available in
its "cache," then that data must be fetched from elsewhere in the system.

In light of the similarity between storage management strategy and garbage
collection, it is reasonable to assign to the garbage collector the chief
responsibility for managing this cache, and in particular for pruning dead wood
from it to make room for new growth. The discussion in this section will assume a
traditional kind of mark-and-sweep garbage collector[17] which runs at intervals
to free up memory space that is not being used productively. Most of the
considerations mentioned, however, also apply to a real-time incremental garbage
collector such as that described by Baker[1].

The first step for the garbage collector running on a processor is to determine
which objects are being used on that processor and which are not. This may be
done by starting with the event list and marking every object referenced from any
event. If the text of any of these objects is present, all objects referenced in
that text should likewise be marked, and so on. This marking phase is not
sufficient in itself, however. Any objects having texts on this processor but also
known on other processors must also be marked, along with all objects reachable
from them. This is because, even though none of the objects so marked may be
reachable from any event on this processor, the globally-known object may be
reachable from an event active on some other processor, whereupon objects
reachable from this object will also be accessible to that event, even though
these latter objects might currently be known only on this processor. It may be
desirable to employ two different kinds of marks in this stage, since the marks
will be used for more than one purpose (cache management as well as outright
deletion of objects).

Any object not marked at all during this procedure must be known only on this
processor (since all objects known elsewhere were marked) and must not be
accessible from any event on this processor or any object known on any other
processor. Such an object is therefore garbage and can be deleted. This is not the
only way the garbage collector can reclaim storage, however. If there are any
objects known also on other processors and not directly reachable from any event
on this processor, it may be appropriate to reduce this processor's involvement
with those objects. If this processor has copies of any of their texts, it is
probably desirable to delete those copies or send them elsewhere if possible. If
this processor is a leaf node in any of their reference trees, it is most likely
worthwhile for this processor to remove itself from those reference trees
altogether.
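The two marking phases and the resulting classification can be sketched as
follows. This is a simplified model rather than the collector used in the
simulation: the names `reachable` and `classify`, and the representation of a
processor's knowledge as plain sets and a dictionary of locally held texts,
are assumptions. In particular, the sketch treats every object known elsewhere
as a root of the second phase, and it takes "not directly reachable from any
event" to mean "not marked in the first phase."

```python
def reachable(roots, texts):
    """Objects reachable from roots, following references only through
    object texts that are present on this processor."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(texts.get(obj, ()))  # no local text: nothing to follow
    return seen


def classify(known, texts, event_refs, shared):
    """Split the objects known on this processor into three groups.

    known      -- every object known on this processor
    texts      -- maps an object to the objects its local text references
    event_refs -- objects referenced from events on this processor
    shared     -- objects also known on other processors
    """
    local_mark = reachable(event_refs, texts)       # first kind of mark
    remote_mark = reachable(shared & known, texts)  # second kind of mark
    garbage = known - local_mark - remote_mark
    # Known elsewhere but not needed by any local event: candidates for
    # dropping a text copy or leaving the reference tree.
    shed_candidates = (shared & known) - local_mark
    # Known only here, kept alive solely by objects known elsewhere.
    retained = (remote_mark - local_mark - shared) & known
    return garbage, shed_candidates, retained
```

Keeping the two kinds of marks in separate sets is what allows the same pass
to serve both outright deletion and cache management.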

There may be objects on this processor which are not known on any other
processor and are not directly reachable from any event on this processor, and yet
are not garbage-collectable. These objects must be kept because they were
marked in the second phase described above as being reachable from other
objects known outside this processor. Since they cannot be deleted outright, part
of the storage management strategy question for these objects is whether to keep
them at their present location or send them to a neighbor. This question cannot be
answered on the basis of statistics gathered by the garbage collector (at least not
of the form described above) and should presumably be answered after considering
the amount of storage available on this processor and its neighbors, or on the basis
of some estimate of which processor is most likely to eventually use these objects.
In the simple approximation used in the simulation of our system, such objects were
allowed to stay put. Note that if any other processor actually asks for these
objects, they will then be known on more than one processor and, at the next
garbage collection, may be shipped away from the processor that had been assuming
a storage role.

Another category of object the garbage collector may want to deal with is
composed of objects that are reachable from active events on the processor but
are distantly enough related to those events that it may not be worthwhile to keep
them close at hand. Our simulation simply left these alone, but a more
sophisticated algorithm might take into account some of the considerations mentioned




Even with the simple strategies actually used, typical garbage collections
reclaimed half or more of the space in use at the time the garbage collection
started, probably an indication that a relatively large proportion of objects have
short lifetimes and never become known outside the processor where they were
created. Of course, this statistic is probably also an indication that the amount of
available space on each processor vastly exceeded the amount needed for the
computations that were attempted. At any rate, the inclusion of good storage
management heuristics in the garbage collector can probably assist a good event
distribution strategy to achieve optimum results.

3.4.4:  Conclusions  Regarding  Event  Distribution 

We have discussed the importance of good event distribution and storage
management strategies and some approaches to their design. The performance of
the simple GRUDGE event distribution strategy was evaluated. This strategy falls
far short of obtaining maximum system throughput, but does manage to make the
system run considerably faster than a single processor. An analysis of some of the
demands facing event distribution strategies led to the conclusion that some kind
of "rubber band" strategy might be an improvement. Also discussed was the
utility of using working set and memory use statistics. Hopefully, improved
strategies using these ideas can be combined with improvements in algorithms for
accumulating working sets to give a much more convincing demonstration of the
validity of the kind of multiprocessor system described in this chapter.




3.5: Conclusions Regarding Our Implementation


This chapter has been devoted to outlining a possible implementation of the mu
calculus on a distributed system and commenting on plausible alternative
implementations. Although the description has not been at the bit-diddling level,
all significant aspects of the implementation have been discussed; a simulator for
a multiple-processor implementation using these algorithms has actually been
constructed.

The implementation described in this chapter is a system based on object
references that uses an event list to keep track of pending computations. The
physical structure assumed is a cellular network in which each processor has a
certain limited number of immediate neighbors with which it can communicate
directly. A system standard external representation for object references, object
texts, and processor ID’s is assumed, although their internal representations can
differ. Object management is done by means of reference trees, which can be
maintained by means of a strictly distributed algorithm (no centralized control)
using protocols presented in the chapter. Garbage collection and management of
mutable objects can also be integrated naturally into the reference tree scheme.
Finally, a simple event distribution strategy was described and more effective
ones suggested.

The simple strategies used in the simulated implementation were sufficient to
yield a significant speed increase in some cases over a single processor following
the same algorithm. It is much more difficult, of course, to arrive at meaningful
figures comparing the performance of this system to that of entirely differently
organized systems solving the same problems. Since the approach presented here
is very general, it is probably safe to say that, for any particular problem, it is
possible to find an implementation more efficient than that given in this thesis,
just as special-purpose hardware can usually be constructed to solve a problem
more quickly than a general-purpose computer. However, general-purpose computers
have the advantage of flexibility, and thus there is a market for them. With
additional refinement, there is reason to hope that multiprocessor systems of the
kind described in this chapter can be realistic entrants into the general-purpose
multiprocessor system sweepstakes, although they will still undoubtedly be
outperformed in specific applications by more specialized systems.



Chapter 4: Conclusions and Directions for Future Work


This  thesis  can  be  taken  as  a study  of  two  fairly  unrelated  concepts:  the 
semantics  of  message  passing,  and  the  architecture  of  an  object-reference  system 
for  multiple  processors.  Both  of  these  areas  of  research,  however,  are  dictated  by 
our  top-level  goal  of  developing  a methodology  for  replacing  single  large  computers 
with  networks  of  smaller  computers. 

The mu calculus set forth in Chapter 2 supplies a semantic basis in terms of
which programs to be run on such a system can be expressed. A very spare
language, the mu calculus is not intended for direct use as a programming tool, but
rather as a target language for translators which might be developed. Thus the
primary emphasis is not on style but on the variety of semantic elements included.
These elements form a useful set, allowing realistic programs to be written while
also providing some novel capabilities (tokens) useful on distributed systems. The
mu calculus is also of interest in its own right as a simple model of
message-passing computation.

Chapter 3 describes a distributed object reference system which could be
used to implement the mu calculus. The system architecture chosen is a cellular
network of processors, with each processor having a limited number of nearest
neighbors. Although many other architectures are possible, most can be made to
imitate this architecture. Furthermore, a large enough network of processors (e.g.,
exceeding the capacity of an individual Ethernet or ring) is almost forced to have
this kind of architecture, at least if viewed at the proper level of aggregation. The
implementation described includes algorithms for the assignment of objects and
events to processors and for doing the necessary bookkeeping. Simulations of the
implementation show that significant use can be made of the parallel capabilities of



world of distributed computing needs is a system designed from the top down,
rather than from the bottom up. That top-down approach may have been obscured
somewhat in the pages that follow; thus it is appropriate to review it. Our top-level
goal is the desire to be able to use many parallel processors effectively; the mu
calculus is in the middle of the hierarchy, and our implementation is at the bottom.

How then do we go from this top-level goal to the realization described in this
thesis? The original stimulus for this research was the problem of replacing one
large processor with a multitude of smaller processors having roughly similar total
capacity. One motivation for this is that the collection of smaller, slower
processors is likely to cost less; another is the ease of adding or removing
increments of computing power in such a system. Obviously, such a network can only
be successful if enough of the smaller processors can be kept busy. In other words,
a sufficient amount of parallelism must be found in the tasks given to the system
that its tremendous capacity can actually be utilized. There are various ways in
which parallelism might be discovered in programs written for a single sequential
machine, or programs might be written with more parallelism explicit. In the
opinion of the author, the second alternative is the more promising, but the thesis
does not take a position on this issue. Rather, the first step in the top-down
development is the specification of a language in which this parallelism can be
expressed well, whatever the route by which it may have been arrived at. This
language is the mu calculus.

Having defined the mu calculus as an embodiment of the semantic concepts
that should be supported by our network, the next step in the top-down
development is to design the network. The mechanics of the mu calculus favor an
object-reference system with garbage-collected storage. Additionally, we should
like the allocation of objects and events to processors to be as flexible as
possible, to give the maximum latitude for rearranging things to keep all
processors as busy as possible. These desiderata, in turn, are all satisfied by the
reference tree mechanism for object management.

4.1: Alternatives to Our Design

Our design is certainly not the only possible result of a top-down effort to
design a distributed system. Obviously, if the objective of the design effort were
different, for example, the development of a special-purpose rather than a
general-purpose system, the result could differ. Even given the same objective,
there are many ways to satisfy it. The full generality of the reference tree
mechanism may be unwarranted in many cases, where a simple strategy of
permanently or semi-permanently attaching an object to a designated processor
might seem preferable. Garbage collection may often be considered an unnecessary
luxury, and manual storage management adopted instead. Changes such as these can be
worked into the reference tree scheme without requiring a complete redesign. The
use of object references, at least at some level of aggregation of data, can be a
great aid in communicating between processors, and probably would be retained in
almost any design.

Even the use of message passing as a semantic basis for the system is more
a matter of style than content. The mu calculus seems to do the best at
incorporating those features which raise interesting points about the design of
distributed systems and leaving out those which only result in uninteresting
detail. It is thus well suited to an exposition of these points, such as this
thesis. The advantage of the mu calculus is in allowing us a very fine-grained look
at a computation: there is no mechanism hidden in, say, subroutine calls and
returns, or expression evaluation. We can therefore avoid explicit discussion of
how these might be implemented, and concentrate on more primitive mechanisms. This
is not to say that expression evaluations and subroutine calls and returns are not
occurring, only that they are happening by means of series of simpler operations
whose implementations have already been specified. Thus the mu calculus is a good
tool for exposing the organizational aspects relevant to the system design,
although not necessarily the best foundation for a particular system.

On the other hand, the mu calculus is well matched to the kinds of activities
that are important in distributed systems. The breaking down of, for example,
expression evaluations and subroutine entry and exit into more elementary
operations increases the number of options available to a distributed system in
handling these activities.

Other than the choice of a semantic basis, the major decision point at which
many alternatives are available is in the selection of a network topology to form
the hardware base for the distributed system. This choice is affected by
economics and scale, as well as by the range of applications envisioned. For a
small network, shared memory might be the most appropriate, offering high bandwidth
and essentially zero communication delay. For a network of modest size, a ring or
Ethernet might be a good choice. However, there are technological limitations on
the number of processors that can be attached to one ring or Ethernet. There is
also the fact that the ring or Ether will tend to become a bottleneck as more
processors are attached to it. Thus the cellular network assumed in this thesis
seems to be the only topology with the potential of scaling up indefinitely. Of
course, it might be more effective for nodes in the network to be, say, small
clusters of processors with shared memory, and negotiations internal to such a
cluster might be conducted on a different basis. Nevertheless, our network
philosophy seems to support the greatest range of scaling up and down, both in size
and performance. It has the additional advantage of conforming to the "thin-wire"
philosophy articulated by Metcalfe[21]; thus strategies suitable for it are likely
to be applicable to, if not optimal for, almost any network.

Once the basic design decision to support mu-calculus-like message passing on
a cellular network has been made, the number of possible ways to fill out the
picture dwindles, although there is doubtless still plenty of scope for ingenuity.
This thesis has shown one way to proceed from this design decision to filling in
the rest of the picture. The implementation presented is complete; it has been
tested and found able to make at least some use of a network of processors. Its
potential, however, is much greater. Further research promises to yield
implementations that come much closer to realizing this potential.

4.2:  The  Mu  Calculus 

There are several possible avenues for the further development of the mu
calculus. The basic definition of the pure mu calculus plus tokens seems fairly
sound, and undoubtedly has many interesting properties that have not yet been
discovered. Evaluations of the mu calculus as a tool for understanding message
passing might be especially welcome. The translations between the mu and lambda
calculi were only informally justified. Formal correctness proofs of these would
give additional insight into the relationship of the two calculi. Proofs of various
properties of the actors constructed out of tokens, such as the parallelism actor v
and the mu-calculus Y operator, would help in understanding these actors.

A better axiom scheme for handling cells and other mutable objects would be a
great contribution. The scheme given in this thesis is adequate for describing
formally what these objects do, but is of little use in proving anything
interesting. Proofs of properties of, for example, the arbiter actor a, would also
be illuminating. Work done by Hewitt[16] and Greif[11] is relevant here.

Finally, the development of the mu calculus into a humanly usable programming
language is a necessary prerequisite to actually building and using any system
based on the mu calculus.

4.3: Implementations

The possibilities for further work in the design of distributed systems are
almost too numerous to mention. Other object management schemes are possible;
perhaps a more compact object management protocol can be developed. Great
amounts of creativity can be absorbed in the development of new strategies for
managing various kinds of mutable objects. Schemes for reorganizing reference
trees may substantially improve the performance of the system. In fact, even the
gathering of statistics to gauge the possible effect of such strategies would be a
significant step. The garbage collection scheme presented in this thesis is at best
imperfectly understood. Are there real cases in which some garbage will never be
collected? Also, how can it be modified to incorporate an incremental garbage
collector such as that of Baker[1]?

The development of improved event distribution and storage management
strategies promises to have a major effect on system performance. This is probably
the area of research in which additional results are the most crucially needed.

Finally, all these suggestions pertain only to making improvements on the
design presented in this thesis. The scope for imagination and originality in
devising new approaches to distributed computing is practically limitless.




Appendix A: Correctness of the Membership Protocol



This appendix tells the story of how the reference tree membership protocol
was developed and tested. Originally, a much simpler membership protocol (having
only four states instead of thirteen) was invented and used as the basis for an
early implementation of a simulator for a multiple-processor message-passing
system. After a protracted period of debugging failed to produce reliable
operation, the author began to suspect the protocol itself. This led to the
construction of a LISP program for testing protocols. The ability of this program
to find weaknesses in the protocol and enable tracing of the circumstances under
which these weaknesses would manifest themselves was invaluable in arriving at the
protocol finally used. This appendix describes the structure and use of this
protocol-testing program, both for its own interest and for the confidence it gives
us that the protocols presented in this thesis are correct.


A.1: The Protocol-Testing Program


The reference tree protocols deal with the state of an individual node in a
network and how it reacts to various stimuli; the protocol-testing program uses the
state-transition rules in a protocol to enumerate the possible states for an entire
network. The state of a network is considered to be an element of the cross
product of the possible states of all nodes in the network and the possible
instantaneous contents of all communication links in the network (i.e., all
messages sent but not yet received). There is no reason to assume that the set of
possible states of a network is finite; it may be possible for unbounded numbers of
messages to pile up on the links. In fact, the original reference tree membership
protocol led to networks with this property. Of course, this makes it impossible
for the protocol tester to enumerate all possible states.

Fortunately, there are protocols which confine networks using them to have
only finite numbers of possible states (i.e., states accessible from some approved
starting state). These protocols can be analyzed exhaustively and thus are more
attractive in the absence of good theoretical tools for studying protocols. Even
though the number of accessible network states may be finite, it may still be quite
large, especially if the network is complex. Furthermore, it is not obvious how a
test using one particular network topology generalizes to other topologies. For
these and other reasons, the protocol tester was actually given an abstraction of a
network in which attention was focused on just two adjacent nodes and the link
between them. Assumptions about the behavior of the network were introduced by
restricting the rules for spontaneous state transitions, leaving only those
transitions that might actually be caused by the hypothesized activities of the
rest of the network. The choice of these assumptions was governed by properties
being tested (e.g., consistency, resistance to disconnecting the tree, etc.).

Thus the "network" for our protocol tester consists of two nodes (which we
may call A and B) and the link between them (which is actually modeled as a pair
of FIFO message queues, one containing messages traveling from A to B, the other
containing messages traveling from B to A; let us call these queues AB and BA,
respectively). The complete state of this network is a combination of the individual
states of nodes A and B, and the contents of queues AB and BA, whose elements
are significant both by their identity and their position (the assumption is that
messages never "pass" each other: no message is ever received before another
message sent earlier in the same direction). A state transition for the network
occurs any time a state transition rule from the protocol (as restricted by
assumptions about the rest of the network) is applied to either A or B. If a rule
calls for a spontaneous transition of, say, A, then the new network state is derived
from the old network state by changing the state of A to its new value and
appending the output (if any) accompanying the transition to A's output queue AB.
If the transition is caused by receiving a message, then that message must be at
the head of the processor's input queue (BA in the case of A), whereupon that
message will be removed from the queue, the state transition effected, and the
resulting output (if any) appended to the output queue as before. The network
state after the transition is the state resulting after all three of these operations
have been performed.
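The transition step just described can be sketched in a few lines of Python. This is only an illustration, not the original tester: the names `Rule` and `apply_rule` are hypothetical, `None` stands in for nil, and each queue is stored oldest message first (the head is element 0).

```python
from collections import namedtuple

# A rule in the tester's format: (current-state input output next-state).
# None plays the role of nil. The names here are illustrative assumptions.
Rule = namedtuple("Rule", "current input output next")

def apply_rule(net, rule, node):
    """Return the network state after `rule` fires at `node` ("A" or "B"),
    or None if the rule is not applicable there.

    `net` is (a_state, ab_queue, b_state, ba_queue); each queue is a tuple
    whose first element is the oldest message (the head of the queue).
    """
    a, ab, b, ba = net
    state, out_q, in_q = (a, ab, ba) if node == "A" else (b, ba, ab)
    if state != rule.current:
        return None
    if rule.input is not None:
        # A receive rule needs its message at the head of the input queue.
        if not in_q or in_q[0] != rule.input:
            return None
        in_q = in_q[1:]                      # 1. remove the message
    new_state = rule.next                    # 2. effect the state transition
    if rule.output is not None:
        out_q = out_q + (rule.output,)       # 3. append any output
    if node == "A":
        return (new_state, out_q, b, in_q)
    return (a, in_q, new_state, out_q)

# Three rules quoted from the membership protocol, applied in sequence.
s0 = ("N", (), "X", ())
s1 = apply_rule(s0, Rule("N", None, "R+", "M?"), "A")   # spontaneous, sends R+
s2 = apply_rule(s1, Rule("X", "R+", "+", "S"), "B")     # B receives R+, sends +
s3 = apply_rule(s2, Rule("M?", "+", None, "M"), "A")    # A receives +, quiet
```

A receive rule that does not find its message at the head of the input queue simply yields `None`, which keeps the FIFO assumption explicit.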

The protocol tester simply applies all applicable rules to each state in a set of
initial states to generate a larger set of accessible states. The process is then
repeated for each new state added to this set until applying any applicable rule to
any state in the set always yields another state already in the set. At this point
the procedure terminates and the final set may be inspected to see if it contains
any undesirable members or does not contain some members that were expected.
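The procedure amounts to closing a set of states under a successor relation. The sketch below shows that fixed-point loop in Python; `reachable` and `toy_successors` are hypothetical names, and the toy successor function merely stands in for "apply every applicable rule to this state".

```python
from collections import deque

def reachable(initial_states, successors):
    """Close `initial_states` under `successors`, a function that maps a
    state to an iterable of the states reachable from it in one transition."""
    seen = set(initial_states)
    frontier = deque(seen)
    while frontier:
        state = frontier.popleft()
        for nxt in successors(state):
            if nxt not in seen:          # only new states are explored again
                seen.add(nxt)
                frontier.append(nxt)
    # Here every successor of every member is already a member.
    return seen

# Toy stand-in for a protocol: integers modulo 5 with two "rules".
def toy_successors(n):
    return [(n + 1) % 5, (n * 2) % 5]

states = reachable({0}, toy_successors)
```

The loop terminates only when the accessible set is finite, which is why the choice of protocol and queue representation matters for this kind of test.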

Obviously the output from this program is only as valid as the assumptions used
in formulating the state transition rules, so the results necessarily constitute at
best an informal demonstration of the soundness of the protocols unless the
assumptions are rigorously justified. Acceptable results of tests with this program
have combined, however, with the positive empirical evidence gained by using the
protocols to give the author reasonable confidence that the protocol is correct;
several bugs were found in the process of implementing the protocol finally chosen,
and in every instance the problem was found to be a failure to implement correctly
the desired protocol, rather than a weakness in the protocol itself.




A.2: Testing the Membership Protocol

A test of the membership protocol should look for several properties:

• closure: all states reachable from any possible starting state should be
states that can be handled following the rules of the protocol; no processor
should ever receive a message that it cannot handle in its current state.

• consistency: all stable network states (i.e., states in which all message
queues are empty) reachable from any possible starting state should show
the desired relationship between processor states; for example, we would
not like to see a stable state in which two processors each thought they
were masters of the same link.

• resistance to disconnection: we do not want our protocol to allow a
reference tree to become disconnected.

• resistance to forming cycles: we do not want our protocol to allow
undirected cycles to be formed in a reference tree.

The test described in this section was primarily concerned with demonstrating
closure and consistency, the most obvious trouble spots in earlier protocols.
Comments will be made later about resistance to disconnection and forming cycles.
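As a rough illustration of how the first two properties might be checked over an enumerated state set, consider the Python sketch below. The function names are hypothetical, queues are stored oldest message first, and `allowed_pairs` (the set of acceptable stable state pairs) is an assumption supplied by whoever runs the check; it is not taken from the protocol definition.

```python
def closure_violations(states, rules):
    """Network states in which the message at the head of a queue cannot be
    handled by its receiver. Rules are (current, input, output, next)."""
    handled = {(cur, inp) for (cur, inp, _out, _nxt) in rules if inp is not None}
    bad = []
    for a, ab, b, ba in states:
        a_stuck = bool(ba) and (a, ba[0]) not in handled   # A reads from BA
        b_stuck = bool(ab) and (b, ab[0]) not in handled   # B reads from AB
        if a_stuck or b_stuck:
            bad.append((a, ab, b, ba))
    return bad

def stable_states(states):
    """States in which both message queues are empty."""
    return [s for s in states if not s[1] and not s[3]]

def inconsistent_states(states, allowed_pairs):
    """Stable states whose (A-state, B-state) pair is not in `allowed_pairs`."""
    return [s for s in stable_states(states) if (s[0], s[2]) not in allowed_pairs]

# A few rules from the protocol and some hand-made states for illustration.
rules = [("M?", "+", None, "M"), ("X", "R+", "+", "S"), ("M", "R-", None, "M")]
sample = [
    ("M?", (), "S", ("+",)),   # fine: M? can accept +
    ("M", (), "S", ("+",)),    # closure problem: M has no rule for +
    ("M", (), "M", ()),        # stable, but both sides claim to be master
    ("M", (), "S", ()),        # stable and acceptable
]
unhandled = closure_violations(sample, rules)
two_masters = inconsistent_states(sample, {("M", "S"), ("S", "M")})
```

Only the head of each queue is inspected, since under the FIFO assumption that is the only message a processor can receive next.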

For this test, state X? of the membership protocol was split into two states, X?
and X!. The new state X! follows state transition rules similar to X?, but records
the fact that a processor in state X? received a reference to the object after
entering that state. Although it does not formally change the state of the
processor with respect to that object, receiving a reference while in state X?
modifies the behavior of the processor in several ways. For one, since the
processor once again has a reference to the object, it can now send one back,
whereas a processor in state X? that has not received a reference should have no
references to that object to send (having just finished indicating its desire to leave
the reference tree for that object altogether). The second difference is that a
processor in state X! should not ever receive an A- message, as a processor in
state X? might. This is because an A- message in this context finalizes a deletion
from the reference tree, which is improper if a processor in state X? has once again
come into possession of an object reference. Once again, it is important to note
that the addition of "state" X! is simply an accounting maneuver to give us a
clearer perception of what is happening to the system; it does not imply that a
processor in operation explicitly needs to keep track of whether it is in state X? or
X!. The mechanism that keeps reference trees connected is present in other states
(notably SR and SR?); the purpose of introducing X! is only to check (for our test)
that this is really happening correctly.

We now give the state transition rules used in the membership protocol test;
additional comments regarding the relevance of this particular set will appear with
the justifications of individual rules. Rules input to the protocol tester have the
following form:

(current-state input output next-state)

An input entry of nil indicates a spontaneous transition; an output entry of nil
indicates a "quiet" transition, one that is not accompanied by any output.
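To make the rule format concrete, here is a small Python sketch that reads one rule in the parenthesized form above. The helper names are hypothetical and nil is mapped to `None`; nothing here is claimed about how the original tester read its input.

```python
def parse_rule(text):
    """Parse '(current-state input output next-state)' into a 4-tuple,
    mapping the atom nil to None."""
    fields = text.strip().strip("()").split()
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {text!r}")
    current, inp, out, nxt = (None if f == "nil" else f for f in fields)
    return current, inp, out, nxt

def is_spontaneous(rule):
    """True when the rule fires without receiving a message (input is nil)."""
    return rule[1] is None

def is_quiet(rule):
    """True when the rule produces no output (output is nil)."""
    return rule[2] is None

send_request = parse_rule("(N nil R+ M?)")   # spontaneous, sends R+
absorb = parse_rule("(M R- nil M)")          # triggered by R-, quiet
```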

The following rules were taken directly from the definition of the membership
protocol (Table 3.6):


(X nil nil N)  (X R+ + S)
(N nil nil X)  (N nil R+ M?)  (N R+ - N?1)
(M nil R- M)  (M nil + S)  (M nil - X?)  (M R- nil M)
(S nil R- SR)  (S + nil M)  (S - A- N?1)  (S R- nil S)
(SR nil R- SR)  (SR + nil M)  (SR - A+ S?)  (SR R- nil SR)
(N? nil nil X?)  (N? nil R- N?)  (N? A+ A- N?1)  (N? A- A- N)  (N? R- nil N?)
(N?1 nil R+ M?2)  (N?1 A- nil N)  (N?1 R- nil N?1)
(M? nil R- M?)  (M? R+ - M?1)  (M? + nil M)  (M? - A- N)  (M? R- nil M?)
(M?1 nil R- M?1)  (M?1 - A- N?1)  (M?1 R- nil M?1)
(M?2 nil R- M?2)  (M?2 A- nil M?)  (M?2 R- nil M?2)
(S? nil R- SR?)  (S? A+ nil S)  (S? A- A- N)  (S? R- nil S?)
(SR? nil R- SR?)  (SR? A+ nil SR)  (SR? A- A- N)  (SR? R- nil SR?)

For state X? we have the following rules:

(X? nil nil N?)  (X? A+ A+ M)  (X? A- A- X)  (X? R- nil X!)

These rules are the same as the state transition rules given in Table 3.6 for state
X?, except that the transition upon receiving an R- message is to the parallel
state X!, recording the receipt of an object reference, instead of back to X?. Also,
the transition "(X? nil R- X?)" has been omitted, since it can only occur if the
processor has received a reference to the object after its transition to state X?. In
our accounting, such a processor would be in state X! instead.

The rules for state X! are

(X! nil nil N?)  (X! nil R- X!)  (X! A+ A+ M)  (X! R- nil X!)

These rules are derived from the rules for state X? by changing all references to
X? into X!; once an object reference has been received while in state X?, that
fact will be remembered until a further transition (to N? or M) occurs. The
transition "(X! A- A- X)" has been omitted, since an A- message (finalizing a break
in the tree) should never be received if an object reference was received first.

This completes our presentation of the state transition rules used for testing
the membership protocol; however, a few more explanations should be made
before the output of that program is shown. There is one exception to the rule
that the protocol tester records all messages in each message queue individually in
the order in which they were sent. Since sending any number of consecutive R-
messages has the same effect on the sender's state as sending a single one, any
sequence of consecutive R- messages in a queue is represented as a single R-
message (it is actually this compression which makes the number of accessible
network states finite). As a result, transitions triggered by the receipt of an R-
message can happen in two ways: accompanied by the removal of the R- message
(the proper action if it was a single message) or without disturbing the R- message
(the effect of only removing the first from a stream of several). The protocol
tester must use both methods to ensure that all accessible states are found.
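The R- compression and its effect on receipt can be sketched as follows. This is an illustrative Python fragment with hypothetical names (`send`, `receive_options`); queues are stored oldest message first, so the head is element 0 and new messages are appended at the end.

```python
def send(queue, msg):
    """Append msg to a queue, collapsing a run of consecutive R- messages
    into a single R- entry."""
    if msg == "R-" and queue and queue[-1] == "R-":
        return queue                 # already represented by the last entry
    return queue + (msg,)

def receive_options(queue):
    """All queues that may remain after the head message is received."""
    if not queue:
        return []
    options = [queue[1:]]            # the head was a single message
    if queue[0] == "R-":
        # The entry may stand for several R- messages, so it may also remain.
        options.append(queue)
    return options

q = send(send(("R+",), "R-"), "R-")           # two R- sends after an R+
after_r_minus = receive_options(("R-", "A+"))
after_r_plus = receive_options(("R+",))
```

Because an R- entry can survive its own receipt, a state explorer has to follow both branches, which is the point made in the paragraph above.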

The second shortcut used by the protocol tester was based on the observation
that the two-node "network" under consideration is symmetrical. Since the set
of starting states exhibits the same symmetry, this symmetry could be used to
reduce by half the number of network states actually considered. Although this
technique was used to speed things up during the protocol design, the tabulation
below will be presented without taking advantage of this compression, to make it
more usable as a reference.
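One way to exploit this symmetry, shown here only as a sketch, is to store each network state under a canonical representative chosen between the state and its mirror image. The names `mirror` and `canonical` are hypothetical, and picking the lexicographically smaller tuple is an arbitrary convention, not something stated in the text.

```python
def mirror(net):
    """Swap the roles of A and B in (a_state, ab_queue, b_state, ba_queue)."""
    a, ab, b, ba = net
    return (b, ba, a, ab)

def canonical(net):
    """Pick one representative of a state and its mirror image."""
    return min(net, mirror(net))

s = ("M?", ("R+",), "X", ())
t = mirror(s)                      # the same situation seen from B's side
distinct = {canonical(s), canonical(t)}
```

Keeping only canonical states in the visited set roughly halves the number of stored states, at the cost of a table that is harder to read as a reference.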

The representation used for a network state was

(A-state AB-queue B-state BA-queue)

Within a queue, the rightmost element was the first sent (and will be the first
received). Thus the network state

(M? (R- R+) X ())

means that processor A has sent an R+ message followed by some number of R-
messages and is now in state M?. Processor B has not yet received any of these
messages, has not sent any messages that have not been received by A, and is
currently in state X.
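The printed form can be produced from an in-memory state with a few lines of Python. This sketch assumes, as a convention of the example only, that queues are held oldest message first, so they must be reversed to put the first-sent message on the right; `format_state` is a hypothetical helper.

```python
def format_state(net):
    """Render (a_state, ab_queue, b_state, ba_queue) in the tabulation's
    notation, where the rightmost queue element was the first sent."""
    a, ab, b, ba = net

    def fmt_queue(queue):
        # Stored oldest-first; printed newest-first.
        return "(" + " ".join(reversed(queue)) + ")"

    return f"({a} {fmt_queue(ab)} {b} {fmt_queue(ba)})"

# A sent R+ and then one or more R- messages; B has received nothing yet.
text = format_state(("M?", ("R+", "R-"), "X", ()))
```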

The single initial network state used was

(X () X ())

which led to all the others shown below. The network states are shown sorted,
first by the state of processor A, then by the state of processor B, then by the
contents of the message queues. Each line in the following tabulation contains a
reference number, a possible network state, the reference number of a previous
state that could lead to that state, the transition rule that would be used to go
from that previous state to the current state, and the processor to which the rule
would be applied.




Membership Protocol Network States

Number  State                                   Previous  Transition Rule   Applied To
    1.  (M () S ())                             51        (M? + nil M)      A
    2.  (M (R-) S ())                           52        (M? + nil M)      A
    3.  (M (R- A+) S? ())                       4         (M nil R- M)      A
    4.  (M (A+) S? ())                          599       (X! A+ A+ M)      A
    5.  (M (R- A+ R-) S? ())                    6         (M nil R- M)      A
    6.  (M (A+ R-) S? ())                       601       (X! A+ A+ M)      A
    7.  (M () SR ())                            8         (M R- nil M)      A
    8.  (M () SR (R-))                          1         (S nil R- SR)     B
    9.  (M (R-) SR ())                          10        (M R- nil M)      A
   10.  (M (R-) SR (R-))                        2         (S nil R- SR)     B
   11.  (M (R- A+) SR? ())                      12        (M R- nil M)      A
   12.  (M (R- A+) SR? (R-))                    3         (S? nil R- SR?)   B
   13.  (M (A+) SR? ())                         14        (M R- nil M)      A
   14.  (M (A+) SR? (R-))                       4         (S? nil R- SR?)   B
   15.  (M (R- A+ R-) SR? ())                   16        (M R- nil M)      A
   16.  (M (R- A+ R-) SR? (R-))                 5         (S? nil R- SR?)   B
   17.  (M (A+ R-) SR? ())                      18        (M R- nil M)      A
   18.  (M (A+ R-) SR? (R-))                    6         (S? nil R- SR?)   B
   19.  (M? (R- R+) M? (R- R+))                 20        (M? nil R- M?)    B
   20.  (M? (R- R+) M? (R+))                    43        (N nil R+ M?)     B
   21.  (M? (R+) M? (R- R+))                    22        (M? nil R- M?)    B
   22.  (M? (R+) M? (R+))                       44        (N nil R+ M?)     B
   23.  (M? () M?1 (R- - R+))                   24        (M?1 nil R- M?1)  B
   24.  (M? () M?1 (- R+))                      22        (M? R+ - M?1)     B
   25.  (M? () M?1 (R- - R- R+))                26        (M?1 nil R- M?1)  B
   26.  (M? () M?1 (- R- R+))                   21        (M? R+ - M?1)     B
   27.  (M? (R-) M?1 (R- - R+))                 28        (M?1 nil R- M?1)  B
   28.  (M? (R-) M?1 (- R+))                    20        (M? R+ - M?1)     B
   29.  (M? (R-) M?1 (R- - R- R+))              30        (M?1 nil R- M?1)  B
   30.  (M? (R-) M?1 (- R- R+))                 19        (M? R+ - M?1)     B
   31.  (M? () M?2 (R- R+ -))                   32        (M?2 nil R- M?2)  B
   32.  (M? () M?2 (R+ -))                      45        (N?1 nil R+ M?2)  B
   33.  (M? (R- R+ A-) M?2 (R- R+))             34        (M?2 nil R- M?2)  B
   34.  (M? (R- R+ A-) M?2 (R+))                46        (N?1 nil R+ M?2)  B
   35.  (M? (R+ A-) M?2 (R- R+))                36        (M?2 nil R- M?2)  B
   36.  (M? (R+ A-) M?2 (R+))                   47        (N?1 nil R+ M?2)  B
   37.  (M? (R- R+ A- R-) M?2 (R- R+))          38        (M?2 nil R- M?2)  B
   38.  (M? (R- R+ A- R-) M?2 (R+))             48        (N?1 nil R+ M?2)  B
   39.  (M? (R+ A- R-) M?2 (R- R+))             40        (M?2 nil R- M?2)  B
   40.  (M? (R+ A- R-) M?2 (R+))                49        (N?1 nil R+ M?2)  B
   41.  (M? (R-) M?2 (R- R+ -))                 42        (M?2 nil R- M?2)  B
   42.  (M? (R-) M?2 (R+ -))                    50        (N?1 nil R+ M?2)  B
   43.  (M? (R- R+) N ())                       55        (X nil nil N)     B
   44.  (M? (R+) N ())                          56        (X nil nil N)     B
   45.  (M? () N?1 (-))                         44        (N R+ - N?1)      B

   46.  (M? (R- R+ A-) N?1 ())                  47        (M? nil R- M?)    A
   47.  (M? (R+ A-) N?1 ())                     252       (N nil R+ M?)     A
   48.  (M? (R- R+ A- R-) N?1 ())               49        (M? nil R- M?)    A
   49.  (M? (R+ A- R-) N?1 ())                  253       (N nil R+ M?)     A
   50.  (M? (R-) N?1 (-))                       43        (N R+ - N?1)      B
   51.  (M? () S (+))                           56        (X R+ + S)        B
   52.  (M? (R-) S (+))                         55        (X R+ + S)        B
   53.  (M? () SR (R- +))                       51        (S nil R- SR)     B
   54.  (M? (R-) SR (R- +))                     52        (S nil R- SR)     B
   55.  (M? (R- R+) X ())                       56        (M? nil R- M?)    A
   56.  (M? (R+) X ())                          254       (N nil R+ M?)     A
   57.  (M?1 (R- - R+) M? ())                   59        (M?1 nil R- M?1)  A
   58.  (M?1 (R- - R+) M? (R-))                 57        (M? nil R- M?)    B
   59.  (M?1 (- R+) M? ())                      22        (M? R+ - M?1)     A
   60.  (M?1 (- R+) M? (R-))                    59        (M? nil R- M?)    B
   61.  (M?1 (R- - R- R+) M? ())                63        (M?1 nil R- M?1)  A
   62.  (M?1 (R- - R- R+) M? (R-))              61        (M? nil R- M?)    B
   63.  (M?1 (- R- R+) M? ())                   20        (M? R+ - M?1)     A
   64.  (M?1 (- R- R+) M? (R-))                 63        (M? nil R- M?)    B
   65.  (M?1 (R- -) M?1 (R- -))                 66        (M?1 nil R- M?1)  B
   66.  (M?1 (R- -) M?1 (-))                    57        (M? R+ - M?1)     B
   67.  (M?1 (R- -) M?1 (R- - R-))              68        (M?1 nil R- M?1)  B
   68.  (M?1 (R- -) M?1 (- R-))                 58        (M? R+ - M?1)     B
   69.  (M?1 (-) M?1 (R- -))                    70        (M?1 nil R- M?1)  B
   70.  (M?1 (-) M?1 (-))                       59        (M? R+ - M?1)     B
   71.  (M?1 (-) M?1 (R- - R-))                 72        (M?1 nil R- M?1)  B
   72.  (M?1 (-) M?1 (- R-))                    60        (M? R+ - M?1)     B
   73.  (M?1 (R- - R-) M?1 (R- -))              74        (M?1 nil R- M?1)  B
   74.  (M?1 (R- - R-) M?1 (-))                 61        (M? R+ - M?1)     B
   75.  (M?1 (R- - R-) M?1 (R- - R-))           76        (M?1 nil R- M?1)  B
   76.  (M?1 (R- - R-) M?1 (- R-))              62        (M? R+ - M?1)     B
   77.  (M?1 (- R-) M?1 (R- -))                 78        (M?1 nil R- M?1)  B
   78.  (M?1 (- R-) M?1 (-))                    63        (M? R+ - M?1)     B
   79.  (M?1 (- R-) M?1 (R- - R-))              80        (M?1 nil R- M?1)  B
   80.  (M?1 (- R-) M?1 (- R-))                 64        (M? R+ - M?1)     B
   81.  (M?1 () M?2 (R- R+ A- -))               82        (M?2 nil R- M?2)  B
   82.  (M?1 () M?2 (R+ A- -))                  113       (N?1 nil R+ M?2)  B
   83.  (M?1 () M?2 (R- R+ A- R- -))            84        (M?2 nil R- M?2)  B
   84.  (M?1 () M?2 (R+ A- R- -))               114       (N?1 nil R+ M?2)  B
   85.  (M?1 () M?2 (R- R+ A- - R-))            86        (M?2 nil R- M?2)  B
   86.  (M?1 () M?2 (R+ A- - R-))               115       (N?1 nil R+ M?2)  B
   87.  (M?1 () M?2 (R- R+ A- R- - R-))         88        (M?2 nil R- M?2)  B
   88.  (M?1 () M?2 (R+ A- R- - R-))            116       (N?1 nil R+ M?2)  B
   89.  (M?1 (R- - R+ A-) M?2 ())               91        (M?1 nil R- M?1)  A
   90.  (M?1 (R- - R+ A-) M?2 (R-))             89        (M?2 nil R- M?2)  B

   91.  (M?1 (- R+ A-) M?2 ())                  36        (M? R+ - M?1)     A
   92.  (M?1 (- R+ A-) M?2 (R-))                91        (M?2 nil R- M?2)  B
   93.  (M?1 (R- - R- R+ A-) M?2 ())            95        (M?1 nil R- M?1)  A
   94.  (M?1 (R- - R- R+ A-) M?2 (R-))          93        (M?2 nil R- M?2)  B
   95.  (M?1 (- R- R+ A-) M?2 ())               34        (M? R+ - M?1)     A
   96.  (M?1 (- R- R+ A-) M?2 (R-))             95        (M?2 nil R- M?2)  B
   97.  (M?1 (R- - R+ A- R-) M?2 ())            99        (M?1 nil R- M?1)  A
   98.  (M?1 (R- - R+ A- R-) M?2 (R-))          97        (M?2 nil R- M?2)  B
   99.  (M?1 (- R+ A- R-) M?2 ())               40        (M? R+ - M?1)     A
  100.  (M?1 (- R+ A- R-) M?2 (R-))             99        (M?2 nil R- M?2)  B
  101.  (M?1 (R- - R- R+ A- R-) M?2 ())         103       (M?1 nil R- M?1)  A
  102.  (M?1 (R- - R- R+ A- R-) M?2 (R-))       101       (M?2 nil R- M?2)  B
  103.  (M?1 (- R- R+ A- R-) M?2 ())            38        (M? R+ - M?1)     A
  104.  (M?1 (- R- R+ A- R-) M?2 (R-))          103       (M?2 nil R- M?2)  B
  105.  (M?1 (R-) M?2 (R- R+ A- -))             106       (M?2 nil R- M?2)  B
  106.  (M?1 (R-) M?2 (R+ A- -))                117       (N?1 nil R+ M?2)  B
  107.  (M?1 (R-) M?2 (R- R+ A- R- -))          108       (M?2 nil R- M?2)  B
  108.  (M?1 (R-) M?2 (R+ A- R- -))             118       (N?1 nil R+ M?2)  B
  109.  (M?1 (R-) M?2 (R- R+ A- - R-))          110       (M?2 nil R- M?2)  B
  110.  (M?1 (R-) M?2 (R+ A- - R-))             119       (N?1 nil R+ M?2)  B
  111.  (M?1 (R-) M?2 (R- R+ A- R- - R-))       112       (M?2 nil R- M?2)  B
  112.  (M?1 (R-) M?2 (R+ A- R- - R-))          120       (N?1 nil R+ M?2)  B
  113.  (M?1 () N?1 (A- -))                     70        (M?1 - A- N?1)    B
  114.  (M?1 () N?1 (A- R- -))                  69        (M?1 - A- N?1)    B
  115.  (M?1 () N?1 (A- - R-))                  72        (M?1 - A- N?1)    B
  116.  (M?1 () N?1 (A- R- - R-))               71        (M?1 - A- N?1)    B
  117.  (M?1 (R-) N?1 (A- -))                   66        (M?1 - A- N?1)    B
  118.  (M?1 (R-) N?1 (A- R- -))                65        (M?1 - A- N?1)    B
  119.  (M?1 (R-) N?1 (A- - R-))                68        (M?1 - A- N?1)    B
  120.  (M?1 (R-) N?1 (A- R- - R-))             67        (M?1 - A- N?1)    B
  121.  (M?2 (R- R+ -) M? ())                   123       (M?2 nil R- M?2)  A
  122.  (M?2 (R- R+ -) M? (R-))                 121       (M? nil R- M?)    B
  123.  (M?2 (R+ -) M? ())                      309       (N?1 nil R+ M?2)  A
  124.  (M?2 (R+ -) M? (R-))                    123       (M? nil R- M?)    B
  125.  (M?2 (R- R+) M? (R- R+ A-))             126       (M? nil R- M?)    B
  126.  (M?2 (R- R+) M? (R+ A-))                197       (N nil R+ M?)     B
  127.  (M?2 (R- R+) M? (R- R+ A- R-))          128       (M? nil R- M?)    B
  128.  (M?2 (R- R+) M? (R+ A- R-))             198       (N nil R+ M?)     B
  129.  (M?2 (R+) M? (R- R+ A-))                130       (M? nil R- M?)    B
  130.  (M?2 (R+) M? (R+ A-))                   199       (N nil R+ M?)     B
  131.  (M?2 (R+) M? (R- R+ A- R-))             132       (M? nil R- M?)    B
  132.  (M?2 (R+) M? (R+ A- R-))                200       (N nil R+ M?)     B
  133.  (M?2 () M?1 (R- - R+ A-))               134       (M?1 nil R- M?1)  B
  134.  (M?2 () M?1 (- R+ A-))                  130       (M? R+ - M?1)     B
  135.  (M?2 () M?1 (R- - R- R+ A-))            136       (M?1 nil R- M?1)  B

  136.  (M?2 () M?1 (- R- R+ A-))               129       (M? R+ - M?1)     B
  137.  (M?2 () M?1 (R- - R+ A- R-))            138       (M?1 nil R- M?1)  B
  138.  (M?2 () M?1 (- R+ A- R-))               132       (M? R+ - M?1)     B
  139.  (M?2 () M?1 (R- - R- R+ A- R-))         140       (M?1 nil R- M?1)  B
  140.  (M?2 () M?1 (- R- R+ A- R-))            131       (M? R+ - M?1)     B
  141.  (M?2 (R- R+ A- -) M?1 ())               143       (M?2 nil R- M?2)  A
  142.  (M?2 (R- R+ A- -) M?1 (R-))             141       (M?1 nil R- M?1)  B
  143.  (M?2 (R+ A- -) M?1 ())                  311       (N?1 nil R+ M?2)  A
  144.  (M?2 (R+ A- -) M?1 (R-))                143       (M?1 nil R- M?1)  B
  145.  (M?2 (R- R+ A- R- -) M?1 ())            147       (M?2 nil R- M?2)  A
  146.  (M?2 (R- R+ A- R- -) M?1 (R-))          145       (M?1 nil R- M?1)  B
  147.  (M?2 (R+ A- R- -) M?1 ())               313       (N?1 nil R+ M?2)  A
  148.  (M?2 (R+ A- R- -) M?1 (R-))             147       (M?1 nil R- M?1)  B
  149.  (M?2 (R- R+ A- - R-) M?1 ())            151       (M?2 nil R- M?2)  A
  150.  (M?2 (R- R+ A- - R-) M?1 (R-))          149       (M?1 nil R- M?1)  B
  151.  (M?2 (R+ A- - R-) M?1 ())               315       (N?1 nil R+ M?2)  A
  152.  (M?2 (R+ A- - R-) M?1 (R-))             151       (M?1 nil R- M?1)  B
  153.  (M?2 (R- R+ A- R- - R-) M?1 ())         155       (M?2 nil R- M?2)  A
  154.  (M?2 (R- R+ A- R- - R-) M?1 (R-))       153       (M?1 nil R- M?1)  B
  155.  (M?2 (R+ A- R- - R-) M?1 ())            317       (N?1 nil R+ M?2)  A
  156.  (M?2 (R+ A- R- - R-) M?1 (R-))          155       (M?1 nil R- M?1)  B
  157.  (M?2 (R-) M?1 (R- - R+ A-))             158       (M?1 nil R- M?1)  B
  158.  (M?2 (R-) M?1 (- R+ A-))                126       (M? R+ - M?1)     B
  159.  (M?2 (R-) M?1 (R- - R- R+ A-))          160       (M?1 nil R- M?1)  B
  160.  (M?2 (R-) M?1 (- R- R+ A-))             125       (M? R+ - M?1)     B
  161.  (M?2 (R-) M?1 (R- - R+ A- R-))          162       (M?1 nil R- M?1)  B
  162.  (M?2 (R-) M?1 (- R+ A- R-))             128       (M? R+ - M?1)     B
  163.  (M?2 (R-) M?1 (R- - R- R+ A- R-))       164       (M?1 nil R- M?1)  B
  164.  (M?2 (R-) M?1 (- R- R+ A- R-))          127       (M? R+ - M?1)     B
  165.  (M?2 () M?2 (R- R+ - A-))               166       (M?2 nil R- M?2)  B
  166.  (M?2 () M?2 (R+ - A-))                  205       (N?1 nil R+ M?2)  B
  167.  (M?2 () M?2 (R- R+ - A- R-))            168       (M?2 nil R- M?2)  B
  168.  (M?2 () M?2 (R+ - A- R-))               206       (N?1 nil R+ M?2)  B
  169.  (M?2 (R- R+ - A-) M?2 ())               171       (M?2 nil R- M?2)  A
  170.  (M?2 (R- R+ - A-) M?2 (R-))             169       (M?2 nil R- M?2)  B
  171.  (M?2 (R+ - A-) M?2 ())                  319       (N?1 nil R+ M?2)  A
  172.  (M?2 (R+ - A-) M?2 (R-))                171       (M?2 nil R- M?2)  B
  173.  (M?2 (R- R+ A-) M?2 (R- R+ A-))         174       (M?2 nil R- M?2)  B
  174.  (M?2 (R- R+ A-) M?2 (R+ A-))            207       (N?1 nil R+ M?2)  B
  175.  (M?2 (R- R+ A-) M?2 (R- R+ A- R-))      176       (M?2 nil R- M?2)  B
  176.  (M?2 (R- R+ A-) M?2 (R+ A- R-))         208       (N?1 nil R+ M?2)  B
  177.  (M?2 (R+ A-) M?2 (R- R+ A-))            178       (M?2 nil R- M?2)  B
  178.  (M?2 (R+ A-) M?2 (R+ A-))               209       (N?1 nil R+ M?2)  B
  179.  (M?2 (R+ A-) M?2 (R- R+ A- R-))         180       (M?2 nil R- M?2)  B
  180.  (M?2 (R+ A-) M?2 (R+ A- R-))            210       (N?1 nil R+ M?2)  B

  181.  (M?2 (R- R+ - A- R-) M?2 ())            183       (M?2 nil R- M?2)  A
  182.  (M?2 (R- R+ - A- R-) M?2 (R-))          181       (M?2 nil R- M?2)  B
  183.  (M?2 (R+ - A- R-) M?2 ())               325       (N?1 nil R+ M?2)  A
  184.  (M?2 (R+ - A- R-) M?2 (R-))             183       (M?2 nil R- M?2)  B
  185.  (M?2 (R- R+ A- R-) M?2 (R- R+ A-))      186       (M?2 nil R- M?2)  B
  186.  (M?2 (R- R+ A- R-) M?2 (R+ A-))         211       (N?1 nil R+ M?2)  B
  187.  (M?2 (R- R+ A- R-) M?2 (R- R+ A- R-))   188       (M?2 nil R- M?2)  B
  188.  (M?2 (R- R+ A- R-) M?2 (R+ A- R-))      212       (N?1 nil R+ M?2)  B
  189.  (M?2 (R+ A- R-) M?2 (R- R+ A-))         190       (M?2 nil R- M?2)  B
  190.  (M?2 (R+ A- R-) M?2 (R+ A-))            213       (N?1 nil R+ M?2)  B
  191.  (M?2 (R+ A- R-) M?2 (R- R+ A- R-))      192       (M?2 nil R- M?2)  B
  192.  (M?2 (R+ A- R-) M?2 (R+ A- R-))         214       (N?1 nil R+ M?2)  B
  193.  (M?2 (R-) M?2 (R- R+ - A-))             194       (M?2 nil R- M?2)  B
  194.  (M?2 (R-) M?2 (R+ - A-))                215       (N?1 nil R+ M?2)  B
  195.  (M?2 (R-) M?2 (R- R+ - A- R-))          196       (M?2 nil R- M?2)  B
  196.  (M?2 (R-) M?2 (R+ - A- R-))             216       (N?1 nil R+ M?2)  B
  197.  (M?2 (R- R+) N (A-))                    121       (M? - A- N)       B
  198.  (M?2 (R- R+) N (A- R-))                 122       (M? - A- N)       B
  199.  (M?2 (R+) N (A-))                       123       (M? - A- N)       B
  200.  (M?2 (R+) N (A- R-))                    124       (M? - A- N)       B
  201.  (M?2 (R- R+ A-) N? ())                  241       (X? nil nil N?)   B
  202.  (M?2 (R- R+ A-) N? (R-))                201       (N? nil R- N?)    B
  203.  (M?2 (R+ A-) N? ())                     243       (X? nil nil N?)   B
  204.  (M?2 (R+ A-) N? (R-))                   203       (N? nil R- N?)    B
  205.  (M?2 () N?1 (- A-))                     199       (N R+ - N?1)      B
  206.  (M?2 () N?1 (- A- R-))                  200       (N R+ - N?1)      B
  207.  (M?2 (R- R+ A-) N?1 (A-))               141       (M?1 - A- N?1)    B
  208.  (M?2 (R- R+ A-) N?1 (A- R-))            142       (M?1 - A- N?1)    B
  209.  (M?2 (R+ A-) N?1 (A-))                  143       (M?1 - A- N?1)    B
  210.  (M?2 (R+ A-) N?1 (A- R-))               144       (M?1 - A- N?1)    B
  211.  (M?2 (R- R+ A- R-) N?1 (A-))            145       (M?1 - A- N?1)    B
  212.  (M?2 (R- R+ A- R-) N?1 (A- R-))         146       (M?1 - A- N?1)    B
  213.  (M?2 (R+ A- R-) N?1 (A-))               147       (M?1 - A- N?1)    B
  214.  (M?2 (R+ A- R-) N?1 (A- R-))            148       (M?1 - A- N?1)    B
  215.  (M?2 (R-) N?1 (- A-))                   197       (N R+ - N?1)      B
  216.  (M?2 (R-) N?1 (- A- R-))                198       (N R+ - N?1)      B
  217.  (M?2 () S (+ A-))                       239       (X R+ + S)        B
  218.  (M?2 () S (+ A- R-))                    240       (X R+ + S)        B
  219.  (M?2 (R-) S (+ A-))                     237       (X R+ + S)        B
  220.  (M?2 (R-) S (+ A- R-))                  238       (X R+ + S)        B
  221.  (M?2 (R- R+ A-) S? ())                  222       (M?2 nil R- M?2)  A
  222.  (M?2 (R+ A-) S? ())                     339       (N?1 nil R+ M?2)  A
  223.  (M?2 (R- R+ A- R-) S? ())               224       (M?2 nil R- M?2)  A
  224.  (M?2 (R+ A- R-) S? ())                  340       (N?1 nil R+ M?2)  A
  225.  (M?2 () SR (R- + A-))                   217       (S nil R- SR)     B

  226.  (M?2 () SR (R- + A- R-))                218       (S nil R- SR)     B
  227.  (M?2 (R-) SR (R- + A-))                 219       (S nil R- SR)     B
  228.  (M?2 (R-) SR (R- + A- R-))              220       (S nil R- SR)     B
  229.  (M?2 (R- R+ A-) SR? ())                 230       (M?2 R- nil M?2)  A
  230.  (M?2 (R- R+ A-) SR? (R-))               221       (S? nil R- SR?)   B
  231.  (M?2 (R+ A-) SR? ())                    232       (M?2 R- nil M?2)  A
  232.  (M?2 (R+ A-) SR? (R-))                  222       (S? nil R- SR?)   B
  233.  (M?2 (R- R+ A- R-) SR? ())              234       (M?2 R- nil M?2)  A
  234.  (M?2 (R- R+ A- R-) SR? (R-))            223       (S? nil R- SR?)   B
  235.  (M?2 (R+ A- R-) SR? ())                 236       (M?2 R- nil M?2)  A
  236.  (M?2 (R+ A- R-) SR? (R-))               224       (S? nil R- SR?)   B
  237.  (M?2 (R- R+) X (A-))                    197       (N nil nil X)     B
  238.  (M?2 (R- R+) X (A- R-))                 198       (N nil nil X)     B
  239.  (M?2 (R+) X (A-))                       199       (N nil nil X)     B
  240.  (M?2 (R+) X (A- R-))                    200       (N nil nil X)     B
  246.  (N () M? (R+))                          251       (N nil R+ M?)     B
  247.  (N (A-) M?2 (R- R+))                    248       (M?2 nil R- M?2)  B
  248.  (N (A-) M?2 (R+))                       252       (N?1 nil R+ M?2)  B
  249.  (N (A- R-) M?2 (R- R+))                 250       (M?2 nil R- M?2)  B
  250.  (N (A- R-) M?2 (R+))                    253       (N?1 nil R+ M?2)  B
  251.  (N () N ())                             254       (X nil nil N)     B
  252.  (N (A-) N?1 ())                         45        (M? - A- N)       A
  253.  (N (A- R-) N?1 ())                      50        (M? - A- N)       A
  254.  (N () X ())                             508       (X nil nil N)     A
  255.  (N? () M?2 (R- R+ A-))                  256       (M?2 nil R- M?2)  B
  256.  (N? () M?2 (R+ A-))                     259       (N?1 nil R+ M?2)  B
  257.  (N? (R-) M?2 (R- R+ A-))                258       (M?2 nil R- M?2)  B
  258.  (N? (R-) M?2 (R+ A-))                   260       (N?1 nil R+ M?2)  B
  259.  (N? () N?1 (A-))                        262       (S - A- N?1)      B
  260.  (N? (R-) N?1 (A-))                      261       (S - A- N?1)      B
  266.  (N? () S? (A+ R-))                      280       (SR - A+ S?)      B
  267.  (N? (R- - A+) S? ())                    268       (N? nil R- N?)    A
  268.  (N? (- A+) S? ())                       544       (X? nil nil N?)   A
  269.  (N? (R- - R- A+) S? ())                 270       (N? nil R- N?)    A
  270.  (N? (- R- A+) S? ())                    546       (X? nil nil N?)   A

Appendix  A:  Corrnetness  of  thn  MmnborcMp  Protocol 


Membership  Protocol  Network  States 


Transition  Applied 


Number  State 

Previous 

Rule 

To 

271. 

(N?  (R-  - A+  R-)  S?  0) 

272 

(N7  nil  R-  N7) 

A 

272. 

(N7  (-  A+  R-)  S7  0) 

648 

(X7  nil  nil  N7) 

A 

273. 

(N7  (R-  - R-  A+  R-)  S7  ()) 

274 

(N7  nil  R-  N7) 

A 

274. 

(N7  (-  R-  A+  R-)  S7  0) 

660 

(X7  nil  nil  N7) 

A 

276. 

(N7  (R-)  S7  (A+)) 

277 

(SR  - A+  S7) 

B 

276. 

(N7  (R-)  S7  (A+  R-)) 

278 

(SR  - A+  S7) 

B 

277. 

(N7  (R-  -)  SR  0) 

278 

(N7  R-  nil  N7) 

A 

278. 

(N7  (R-  -)  SR  (R-)) 

261 

(S  nil  R-  SR) 

B 

279. 

(N7  (-)  SR  0) 

280 

(N7  R-  nil  N7) 

A 

280. 

(N7  (-)  SR  (R-)) 

262 

(S  nil  R-  SR) 

B 

281. 

(N7  (R-  - R-)  SR  0) 

282 

(N7  R-  nil  N?) 

A 

282. 

(N7  (R-  - R-)  SR  (R-)) 

263 

(S  nil  R-  SR) 

B 

283. 

(N7  (-  R-)  SR  0) 

284 

(N7  R-  nil  N7) 

A 

284. 

(N7  (-  R-)  SR  (R-)) 

264 

(S  nil  R-  SR) 

B 

286. 

(N7  0 SR7  (R-  A+)) 

266 

(S7  nil  R-  SR7) 

B 

286. 

(N7  0 SR7  (R-  A+  R-)) 

266 

(S7  nil  R-  SR7) 

B 

287. 

(N7  (R-  - A+)  SR7  0) 

268 

(N7  R-  nil  N7) 

A 

288. 

(N7  (R-  - A+)  SR7  (R-)) 

267 

(S7  nil  R-  SR7) 

B 

289. 

(N7  (-  A+)  SR7  0) 

290 

(N?  R-  nil  N7) 

A 

290. 

(N7  (-  A+)  SR7  (R-)) 

268 

(S7  nil  R-  SR7) 

B 

291. 

(N7  (R-  - R-  A+)  SR7  ()) 

292 

(N7  R-  nil  N7) 

A 

292. 

(N7  (R-  - R-  A+)  SR7  (R-)) 

269 

(S7  nil  R-  SR7) 

B 

293. 

(N7  (-  R-  A+)  SR7  0) 

294 

(N7  R-  nil  N7) 

A 

294. 

(N7  (-  R-  A+)  SR7  (R-)) 

270 

(S?  nil  R-  SR?)

B 

296. 

(N7  (R-  - A+  R-)  SR7  ()) 

296 

(N7  R-  nil  N7) 

A 

296. 

(N7  (R-  - A+  R-)  SR7  (R-)) 

271 

(S7  nil  R-  SR7) 

B 

297. 

(N7  (-  A+  R-)  SR7  0) 

298 

(N7  R-  nil  N7) 

A 

298. 

(N?  (-  A+  R-)  SR?  (R-))

272 

(S7  nil  R-  SR7) 

B 

299. 

(N7  (R-  - R-  A+  R-)  SR7  ()) 

300 

(N7  R-  nil  N7) 

A 

300. 

(N7  (R-  - R-  A+  R-)  SR7  (R-)) 

273 

(S7  nil  R-  SR7) 

B 

301. 

(N7  (-  R-  A+  R-)  SR7  ()) 

302 

(N7  R-  nil  N7) 

A 

302. 

(N7  (-  R-  A+  R-)  SR7  (R-)) 

274 

(S7  nil  R-  SR7) 

B 

303. 

(N7  (R-)  SR7  (R-  A+)) 

276 

(S7  nil  R-  SR7) 

B 

304. 

(N7  (R-)  SR7  (R-  A+  R-)) 

276 

(S7  nil  R-  SR7) 

B 

306. 

(N71  0 M7  (R-  R+  A-)) 

306 

(M7  nil  R-  M7) 

B 

306. 

(N71  0 M7  (R+  A-)) 

331 

(N  nil  R*  M7) 

B 

307. 

(N71  0 M7  (R-  R+  A-  R-)) 

308 

(M7  nil  R-  M7) 

B 

308. 

(N71  0 M7  (R+  A-  R-)) 

332 

(N  nil  R*  M7) 

B 

309. 

(N71  (-)  M7  0) 

246 

(N  R+  - N71) 

A 

310. 

(N71  (-)  M7  (R-)) 

309 

(M7  nil  R-  M7) 

B 

311. 

(N71  (A-  -)  M71  0) 

70 

(M71  - A-  N71) 

A 

312. 

(N71  (A-  -)  M71  (R-)) 

311 

(M71  nil  R-  M?1) 

B 

313. 

(N71  (A-  R-  -)  M71  0) 

66 

(M71  - A-  N71) 

A 

314. 

(N71  (A-  R-  -)  M71  (R-)) 

313 

(M71  nil  R-  M71) 

B 

316. 

(N71  (A-  - R-)  M71  0) 

78 

(M71  - A-  N71) 

A 

Section  A.2:  Testing  the  Membership  Protocol



316. 

(N71  (A-  - R-)  M71  (R-)) 

316 

(M?1  nil  R-  M?1) 

B

317. 

(N71  (A-  R-  - R-)  M71  ()) 

74 

(M71  - A-  N71) 

A 

318. 

(N71  (A-  R-  - R-)  M71  (R-)) 

317 

(M71  nil  R-  M71) 

B 

310. 

(N71  (-  A-)  M72  0) 

248 

(N  R+  - N71) 

A 

320. 

(N?1  (-  A-)  M72  (R-)) 

319 

(M72  nil  R-  M?2) 

B

321. 

(N?1  (A-)  M72  (R-  R+  A-)) 

322 

(M?2  nil  R-  M?2)

B 

322. 

(N71  (A-)  M?2  (R+  A-)) 

336 

(N71  nil  R*  M72) 

B 

323. 

(N71  (A-)  M72  (R-  R*  A-  R-)) 

324 

(M?2  nil  R-  M?2) 

B 

324. 

(N?1  (A-)  M72  (R+  A-  R-)) 

336 

(N71  nil  R+  M72) 

B 

326. 

(N71  (-  A-  R-)  M72  0) 

260 

(N  R+  - N?1) 

A 

326. 

(N71  (-  A-  R-)  M72  (R-)) 

326 

(M72  nil  R-  M72) 

B 

327. 

(N71  (A-  R-)  M72  (R-  R+  A-)) 

328 

(M72  nil  R-  M72) 

B 

328. 

(N71  (A-  R-)  M72  (R+  A-)) 

337 

(N71  nil  R*  M72) 

B 

329. 

(N?1  (A-  R-)  M?2  (R-  R+  A-  R-))

330 

(M72  nil  R-  M72) 

B 

330. 

(N71  (A-  R-)  M72  (R+  A-  R-)) 

338 

(N71  nil  R*  M72) 

B 

331. 

(N?1  ()  N  (A-))

300 

(M7  - A-  N) 

B 

332. 

(N71  0 N (A-  R-)) 

310 

(M?  - A-  N) 

B 

333. 

(N71  (A-)  N7  0) 

347 

(X7  nil  nil  N?) 

B 

334. 

(N71  (A-)  N7  (R-)) 

333 

(N7  nil  R-  N7) 

B 

336. 

(N71  (A-)  N71  (A-)) 

311 

(M71  - A-  N71) 

B 

336. 

(N71  (A-)  N71  (A-  R-)) 

312 

(M71  - A-  N71) 

B 

337. 

(N71  (A-  R-)  N71  (A-)) 

313 

(M71  - A-  N71) 

B 

338. 

(N71  (A-  R-)  N71  (A-  R-)) 

314 

(M71  - A-  N71) 

B 

330. 

(N71  (A-)  S7  0) 

266 

(N7  A+  A-  N71) 

A 

340. 

(N71  (A-  R-)  S7  0) 

275 

(N?  A+  A-  N71) 

A 

341. 

(N71  (A-)  SR7  0) 

342 

(N71  R-  nil  N71) 

A 

342. 

(N71  (A-)  SR7  (R-)) 

339 

(S?  nil  R-  SR7) 

B 

343. 

(N71  (A-  R-)  SR7  0) 

344 

(N71  R-  nil  N71) 

A 

344. 

(N71  (A-  R-)  SR7  (R-)) 

340 

(S?  nil  R-  SR?) 

B 

346. 

(N71  0 X (A-)) 

(N  nil  nil  X) 

B 

346. 

(N71  0 X (A-  R-)) 

332 

(N  nil  nil  X) 

B 

347. 

(N71  (A-)  X7  0) 

384 

(S  - A-  N71) 

A 

348. 

(N71  (A-)  X7  (R-)) 

334 

(N7  nil  nil  X7) 

B 

340. 

(S  0 M 0) 

361 

(M?  * nil  M) 

B 

360. 

(S  ()  M (R-)) 

362 

(M7  * nil  M) 

B 

361. 

(S  (♦)  M7  0) 

600 

(X  R+  +  S)

A 

362. 

(S  (♦)  M7  (R-)) 

361 

(M?  nil  R-  M7) 

B 

363. 

(S  (♦  A-)  M72  0) 

602 

(X  R*  * S) 

A 

364. 

(S  (♦  A-)  M72  (R-)) 

363 

(M72  nil  R-  M72) 

B 

366. 

(S  (♦  A-  R-)  M72  0) 

(X  R+  +  S)

A 

366. 

(S  (+  A-  R-)  M?2  (R-))

366 

(M72  nil  R-  M72) 

B 

367. 

(S  0 M7  (R-  -)) 

368 

(N7  nil  R-  N7) 

B 

368. 

(S  0 N7  (-)) 

384 

(X?  nil  nil  N7) 

B 

360. 

(S  0 M?  (R-  - R-)) 

360 

(N?  nil  R-  N7) 

B 

360. 

(S  0 M7  (-  R-)) 

386 

(X?  nil  nil  N?)

B 


361. 

(S  0 S (♦)) 

349 

(M  nil  * S) 

362. 

(S  0 S {♦  R-)) 

350 

(M  nil  4 S) 

363. 

(S  (♦)  S 0) 

1 

(M  nil  4 S) 

364. 

(S  (♦  R-)  S 0) 

2 

(M  nil  4 S) 

365. 

(S  (+  A+)  S?  0) 

4 

(M  nil  4 S) 

366. 

(S  (♦  R-  A+)  S7  0) 

3 

(M  nil  4 S) 

367. 

(S  (♦  A+  R-)  S?  0) 

6 

(M  nil  4 S) 

368. 

(S  (♦  R-  A+  R-)  S?  0) 

6 

(M  nil  4 S) 

369. 

(S  0 SR  (R-  ♦)) 

361 

(S  nil  R-  SR) 

370. 

(S  0 SR  (R-  ♦ R-)) 

362 

(S  nil  R-  SR) 

371. 

(S  (+)  SR  0) 

372 

(S  R-  nil  S) 

372. 

(S  (♦)  SR  (R-)) 

363 

(S  nil  R-  SR) 

373. 

(S  (♦  R-)  SR  0) 

374 

(S  R-  nil  S) 

374. 

(S  (♦  R-)  SR  (R-)) 

364 

(S  nil  R-  SR) 

376. 

(S  (♦  A+)  SR?  0) 

(S  R-  nil  S)

376. 

(S  (♦  A+)  SR?  (R-)) 

365

(S?  nil  R-  SR?) 

377. 

(S  (♦  R-  A+)  SR?  0) 

378

(S  R-  nil  S) 

378. 

(S  (♦  R-  A+)  SR?  (R-)) 

366 

(S?  nil  R-  SR?) 

379. 

(S  (♦  A+  R-)  SR?  0) 

380 

(S  R-  nil  S) 

380. 

(S  (♦  A+  R-)  SR?  (R-)) 

367 

(S?  nil  R-  SR?) 

381. 

(S  (♦  R-  A+  R-)  SR?  0) 

382 

(S  R-  nil  S) 

382. 

(S  (♦  R-  A+  R-)  SR?  (R-)) 

368 

(S?  nil  R-  SR?) 

383. 

(S  0 X?  (R-  -)) 

367 

(N?  nil  nil  X?) 

384. 

(S  0 X?  (-)) 

349 

(M  nil  - X?) 

386. 

(S  0 X?  (R-  - R-)) 

360 

(N?  nil  nil  X?) 

386. 

(S  0 X?  (-  R-)) 

360 

(M  nil  - X?) 

387. 

(S?  0 M (R-  A+)) 

388

(M  nil  R-  M) 

388. 

(S?  0 M (A+)) 

417 

(X!  A+  A+  M)

389. 

(S?  0 M (R-  A+  R-)) 

300 

(M  nil  R-  M) 

300. 

(S?  0 M (A*  R-)) 

418 

(X!  A+  A+  M)

301. 

(S?  0 M?2  (R-  R+  A-)) 

302 

(M?2  nil  R-  M?2)

302. 

(S?  0 M?2  (R+  A-)) 

407 

(N?1  nil  R4  M72) 

303. 

(S?  0 M?2  (R-  R+  A-  R-)) 

304 

(M?2  nil  R-  M?2) 

304. 

(S?  0 M?2  (R+  A-  R-)) 

408 

(N?1  nil  R4  M?2) 

306. 

(S?  0 N?  (R-  - A+)) 

306 

(N?  nil  R-  N?) 

306. 

(S?  0 N?  (-  A*)) 

422 

(X?  nil  nil  N?) 

307. 

(S?  0 N?  (R-  - R-  A+)) 

398 

(N?  nil  R-  N?) 

308. 

(S?  0 N?  (-  R-  A+)) 

424 

(X?  nil  nil  N?) 

309. 

(S?  0 N?  (R-  - A+  R-)) 

400 

(N?  nil  R-  N?) 

400. 

(S?  O N?  (-  A+  R-)) 

426 

(X?  nil  nil  N?) 

401. 

(S?  0 N?  (R-  - R-  A+  R-)) 

402 

(N?  nil  R-  N?) 

402. 

(S?  0 N?  (-  R-  A+  R-)) 

428 

(X?  nil  nil  N?) 

403. 

(S?  (A+)  N?  0) 

406 

(N?  R-  nil  N?) 

404. 

(S?  (A*)  N?  (R-)) 

406 

(N?  R-  nil  N?) 

406. 

(S?  (A+  R-)  N?  ())

431

(X?  nil  nil  N?)


406. 

(S?  (A+  R-)  N?  (R-)) 

405

(N?  nil  R-  N?)

B 

407. 

(S?  0 N71  (A-)) 

403 

(N7  A+  A-  N71) 

B 

408. 

(S?  0 N?1  (A-  R-)) 

404 

(N7  A+  A-  N71) 

B 

400. 

(S7  0 S (♦  A+)) 

388 

(M  nil  * S) 

B 

410. 

(S7  0 S (♦  R-  A+)) 

387 

(M  nil  * S) 

B 

411. 

(S7  0 S (+  A+  R-)) 

300 

(M  nil  +  S)

B 

412. 

(S7  0 S (♦  R-  A+  R-)) 

380 

(M  nil  + S) 

B 

413. 

(S7  0 SR  (R-  + A+)) 

400 

(S  nil  R-  SR) 

B 

414. 

(S7  0 SR  (R-  ♦ R-  A+)) 

410 

(S  nil  R-  SR) 

B 

415. 

(S7  0 SR  (R-  ♦ A+  R-)) 

411 

(S  nil  R-  SR) 

B 

416. 

(S7  ()  SR  (R-  ♦ R-  A*  R-)) 

412 

(S  nil  R-  SR) 

B 

417. 

(S7  (A4)  X!  0) 

431 

(X7  R-  nil  X!) 

B 

418. 

(S7  (A+)  X!  (R-)) 

417 

(X!  nil  R-  X!) 

B 

410. 

(S7  (A*  R-)  X!  0) 

431 

(X7  R-  nil  XI) 

B 

420. 

(S7  (A*  R-)  XI  (R-)) 

410 

(XI  nil  R-  X!) 

B 

421. 

(S7  0 X7  (R-  - A+)) 

306 

(N7  nil  nil  X7) 

B 

422. 

(S7  0 X7  (-  A+)) 

388 

(M  nil  - X7) 

B 

423. 

(S7  0 X7  (R-  - R-  A+)) 

307 

(N7  nil  nil  X7) 

B 

424. 

(S7  0 X7  (-  R-  A+)) 

387 

(M  nil  - X7) 

B 

426.

(S?  ()  X?  (R-  -  A+  R-))

300

(N7  nil  nil  X?) 

B 

426. 

(S7  0 X7  (-  A+  R-)) 

300 

(M  nil  - X7) 

B 

427. 

(S?  0 X7  (R-  - R-  A*  R-)) 

401 

(R7  nil  nil  X7) 

B 

428. 

(S7  0 X7  (-  R-  A+  R-)) 

380 

(M  nil  - X7) 

B 

420. 

(S7  (A+)  X7  0) 

486 

(SR  - A+  S7) 

A 

430. 

(S7  (A+)  X7  (R-)) 

404 

(N7  nil  nil  X7) 

B 

431. 

(S7  (A+  R-)  X7  (i) 

400 

(SR  - A+  S7) 

A 

432. 

(S7  (A+  R-)  X7  (R-)) 

406 

(N7  nil  nil  X7) 

B 

433. 

(SR  0 M 0) 

435 

(M  R-  nil  M)

B 

434. 

(SR  0 M (R-)) 

436 

(M  R-  nil  M) 

B

435.

(SR  (R-)  M  ())

437 

(M?  +  nil  M)

B 

436. 

(SR  (R-)  M (R-)) 

438 

(M?  +  nil  M)

B 

437. 

(SR  (R-  ♦)  M7  0) 

361 

(S  nil  R-  SR) 

A 

438. 

(SR  (R-  ♦)  M7  (R-)) 

437 

(M7  nil  R-  M7) 

B 

430. 

(SR  (R-  +  A-)  M?2  ())

363 

(S  nil  R-  SR) 

A 

440. 

(SR  (R-  ♦ A-)  M72  (R-)) 

430 

(M72  nil  R-  M72) 

B 

441. 

(SR  (R-  ♦ A-  R-)  M72  ()) 

366 

(S  nil  R-  SR) 

A 

442. 

(SR  (R-  +  A-  R-)  M?2  (R-))

441 

(M72  nil  R-  M72) 

B 

443. 

(SR  0 N7  (R-  -)) 

447 

(N7  R-  nil  N7) 

B 

444. 

(SR  0 N7  (-)) 

448 

(N7  R-  nil  N7) 

B 

446. 

(SR  0 N7  (R-  - R-)) 

440 

(N7  R-  nil  N7) 

B 

440. 

(SR  0 N7  (-  R-)) 

460 

(N7  R-  nil  N7) 

B 

447. 

(SR  (R-)  M7  (R-  -)) 

448 

(N7  nil  R-  N7) 

B 

448. 

(SR  (R-)  N7  (-)) 

400 

(X7  nil  nil  N7) 

B 

440. 

(SR  (R-)  N7  (R-  - R-)) 

460 

(N7  nil  R-  N7) 

B 

460. 

(SR  (R-)  M7  (-  R-)) 

402 

(X7  nil  nil  N7) 

B 


461. 

(SR  0 S (+)) 

466 

(S  R-  nil  S) 

B 

452. 

(SR  0 S (♦  R-)) 

466 

(S  R-  nil  S) 

B 

453. 

(SR  (R-  +)  S 0) 

363 

(S  nil  R-  SR) 

A 

464. 

(SR  (R-  ♦ R-)  S 0) 

364 

(S  nil  R-  SR) 

A 

466. 

(SR  (R-)  S (+)) 

436 

(M  nil  * S) 

B 

456. 

(SR  (R-)  S (+  R-)) 

436 

(M  nil  +  S)

B 

467. 

(SR  (R-  + A+)  S?  0) 

366 

(S  nil  R-  SR)

A 

468. 

(SR  (R-  + R-  A+)  S7  0) 

366 

(S  nil  R-  SR) 

A 

459. 

(SR  (R-  ♦ A+  R-)  S?  0) 

367 

(S  nil  R-  SR) 

A 

460. 

(SR  (R-  ♦ R-  A+  R-)  S?  0) 

368 

(S  nil  R-  SR)

A 

461. 

(SR  0 SR  (R-  +)) 

467 

(SR  R-  nil  SR) 

B 

462. 

(SR  0 SR  (R-  + R-)) 

468 

(SR  R-  nil  SR) 

B 

463. 

(SR  (R-  +)  SR  0) 

464 

(SR  R-  nil  SR) 

A 

464. 

(SR  (R-  +)  SR  (R-)) 

463 

(S  nil  R-  SR)

B 

466. 

(SR  (R-  + R-)  SR  0) 

466 

(SR  R-  nil  SR) 

A 

466. 

(SR  (R-  + R-)  SR  (R-)) 

464 

(S  nil  R-  SR) 

B 

467. 

(SR  (R-)  SR  (R-  ♦)) 

466 

(S  nil  R-  SR)

B 

468. 

(SR  (R-)  SR  (R-  ♦ R-)) 

466 

(S  nil  R-  SR)

B 

469. 

(SR  (R-  ♦ A+)  SR?  0) 

470 

(SR  R-  nil  SR)

A 

470. 

(SR  (R-  ♦ A*)  SR?  (R-)) 

467 

(S?  nil  R-  SR?) 

B 

471. 

(SR  (R-  ♦ R-  A+)  SR?  0) 

472 

(SR  R-  nil  SR) 

A 

472. 

(SR  (R-  ♦ R-  A+)  SR?  (R-)) 

468 

(S?  nil  R-  SR?) 

B 

473. 

(SR  (R-  + A+  R-)  SR?  0) 

474 

(SR  R-  nil  SR) 

A 

474. 

(SR  (R-  ♦ A+  R-)  SR?  (R-)) 

469 

(S?  nil  R-  SR?) 

B 

476. 

(SR  (R-  + R-  A+  R-)  SR?  ()) 

476 

(SR  R-  nil  SR) 

A 

476. 

(SR  (R-  ♦ R-  A+  R-)  SR?  (R-)) 

460 

(S?  nil  R-  SR?) 

B 

477. 

(SR  0 X!  (R-  -)) 

478 

(X!  nil  R-  X!) 

B 

478. 

(SR  0 X!  (-)) 

490 

(X?  R-  nil  X!) 

B 

479. 

(SR  0 X!  (R-  - R-)) 

480 

(X!  nil  R-  X!) 

B 

480. 

(SR  0 X!  (-  R-)) 

402 

(X?  R-  nil  X!) 

B 

481. 

(SR  (R-)  X!  (R-  -)) 

482 

(X!  nil  R-  XI) 

B 

482. 

(SR  (R-)  X!  (-)) 

490 

(X?  R-  nil  X!) 

B 

483. 

(SR  (R-)  X!  (R-  - R-)) 

484 

(X!  nil  R-  X!) 

B 

484. 

(SR  (R-)  X!  (-  R-)) 

402 

(X?  R-  nil  X!) 

B 

486. 

(SR  0 X?  (R-  -)) 

443 

(N?  nil  nil  X?) 

B 

486. 

(SR  0 X?  (-)) 

433 

(M  nil  - X?) 

B 

487. 

(SR  0 X?  (R-  - R-)) 

446 

(N?  nil  nil  X?)

B 

488. 

(SR  0 X?  (-  R-)) 

434 

(M  nil  - X?) 

B 

489. 

(SR  (R-)  X?  (R-  -)) 

447 

(N?  nil  nil  X7) 

B 

490. 

(SR  (R-)  X?  (-)) 

436 

(M  nil  - X?) 

B 

491. 

(SR  (R-)  X?  (R-  - R-)) 

449 

(N?  nil  nil  X?) 

B 

492. 

(SR  (R-)  X?  (-  R-)) 

436 

(M  nil  - X?) 

B 

493. 

(SR?  0 M (R-  A*)) 

407 

(M  R-  nil  M) 

B 

494. 

(SR?  0 M (A*)) 

498 

(M  R-  nil  M) 

B 

496. 

(SR?  0 M (R-  A*  R-)) 

490 

(M  R-  nil  M) 

B 



State                        Transition Rule

(SR? () M (A+ R-))           (M R- nil M)
(SR? (R-) M (R- A+))         (M nil R- M)
(SR? (R-) M (A+))            (X! A+ A+ M)
(SR? (R-) M (R- A+ R-))      (M nil R- M)
(SR? (R-) M (A+ R-))         (X! A+ A+ M)

501. 

(SR?  0 M?2  (R-  R+  A-)) 

505 

(M?2  R-  nil  M?2) 

502. 

(SR?  0 M?2  (R+  A-)) 

506 

(M?2  R-  nil  M?2) 

503. 

(SR?  0 M?2  (R-  R+  A-  R-)) 

507 

(M?2  R-  nil  M?2) 

504. 

(SR?  0 M?2  (R+  A-  R-)) 

508 

(M?2  R-  nil  M?2) 

505. 

(SR?  (R-)  M?2  (R-  R+  A-)) 

506 

(M?2  nil  R-  M?2) 

506. 

(SR?  (R-)  M?2  (R+  A-)) 

531 

(N?1  nil  R+  M?2) 

507. 

(SR?  (R-)  M?2  (R-  R+  A-  R-)) 

508 

(M?2  nil  R-  M?2) 

508. 

(SR?  (R-)  M?2  (R+  A-  R-)) 

532 

(N?1  nil  R+  M?2) 

500. 

(SR?  0 N?  (R-  - A+)) 

521 

(N?  R-  nil  N?) 

510. 

(SR?  0 N?  (-  A+)) 

522 

(N?  R-  nil  N?) 

511. 

(SR?  0 N?  (R-  - R-  A+)) 

523 

(N?  R-  nil  N?) 

512. 

(SR?  0 N?  (-  R-  A+)) 

524 

(N?  R-  nil  N?) 

513. 

(SR?  0 N?  (R-  - A+  R-)) 

525 

(N?  R-  nil  N?) 

514. 

(SR?  0 N?  (-  A+  R-)) 

526 

(N?  R-  nil  N?) 

515. 

(SR?  0 N?  (R-  - R-  A+  R-)) 

527 

(N?  R-  nil  N?) 

516. 

(SR?  0 N?  (-  R-  A+  R-)) 

528 

(N?  R-  nil  N?) 

517. 

(SR?  (R-  A+)  N?  0) 

510 

(N?  R-  nil  N?) 

518. 

(SR?  (R-  A+)  N?  (R-)) 

520 

(N?  R-  nil  N?) 

510. 

(SR?  (R-  A+  R-)  N?  0) 

570 

(X?  nil  nil  N?) 

520. 

(SR?  (R-  A+  R-)  N?  (R-)) 

510 

(N?  nil  R-  N?) 

521. 

(SR?  (R-)  N?  (R-  - A+)) 

522 

(N?  nil  R-  N?) 

522. 

(SR?  (R-)  M?  (-  A+)) 

582 

(X?  nil  nil  N?) 

523. 

(SR?  (R-)  N?  (R-  - R-  A+)) 

524 

(N?  nil  R-  N?) 

524. 

(SR?  (R-)  N?  (-  R-  A+)) 

584 

(X?  nil  nil  N?) 

525. 

(SR?  (R-)  N?  (R-  - A+  R-)) 

526 

(N?  nil  R-  N?) 

536. 

(SR?  0 S (4  R-  A4  R-)) 

640 

(S  R-  nil  S) 

B 

637. 

(SR?  (R-)  S (4  A4)) 

408 

(M  nil  4 S) 

B 

638. 

(SR?  (R-)  S (4  R-  A4)) 

407 

(M  nil  4 S) 

B 

630. 

(SR?  (R-)  S (4  A4  R-)) 

600 

(M  nil  4 S) 

B 

640. 

(SR?  (R-)  S (4  R-  A4  R-)) 

400 

(M  nil  4 S) 

B 


641. 

(SR?  0 SR  (R-  A+)) 

646 

(SR  R-  nil  SR) 

6 

642. 

(SR?  0 SR  (R-  + R-  A*)) 

646 

(SR  R-  nil  SR) 

B 

643. 

(SR?  0 SR  (R-  ♦ A+  R-)) 

647 

(SR  R-  nil  SR) 

6 

644. 

(SR?  0 SR  (R-  ♦ R-  A*  R-)) 

648 

(SR  R-  nil  SR) 

B 

646. 

(SR?  (R-)  SR  (R-  ♦ A*)) 

637 

(S  nil  R-  SR) 

B 

646. 

(SR?  (R-)  SR  (R-  + R-  A+)) 

638 

(S  nil  R-  SR) 

B 

647. 

(SR?  (R-)  SR  (R-  ♦ A+  R-)) 

639 

(S  nil  R-  SR) 

B 

648. 

(SR?  (R-)  SR  (R-  ♦ R-  A+  R-)) 

640 

(S  nil  R-  SR) 

B 

649. 

(SR?  0 X!  (R-  - A+)) 

660 

(X!  nil  R-  X!) 

B 

660.

(SR?  ()  X!  (-  A+))

582

(X?  R-  nil  X!) 

B 

661. 

(SR?  ()  X!  (R-  - R-  A+))

662

(X!  nil  R-  X!)

B 

662. 

(SR?  0 XI  (-  R-  A+)) 

684 

(X?  R-  nil  XI) 

B 

653. 

(SR?  0 X!  (R-  - A+  R-)) 

664 

(X!  nil  R-  X!) 

B 

664. 

(SR?  0 X!  (-  A+  R-)) 

686 

(X7  R-  nil  X!) 

B 

666. 

(X!  nil  R-  X!) 

B 

658. 

(SR?  ()  X!  (-  R-  A+  R-))

688 

(X?  R-  nil  X!) 

B 

667. 

(SR?  (R-  A*)  X!  0) 

679 

(X?  R-  nil  X!) 

B 

668. 

(SR?  (R-  A+)  X!  (R-)) 

667 

(X!  nil  R-  XI) 

B 

569. 

(SR?  (R-  A+  R-)  X!  ())

679

(X?  R-  nil  X!)

B 

680. 

(SR?  (R-  A+  R-)  X!  (R-)) 

659 

(X!  nil  R-  X!) 

B 

661. 

(SR?  (R-)  X!  (R-  - A+)) 

662 

(X!  nil  R-  X!) 

B 

662. 

(SR?  (R-)  X!  (-  A+)) 

682 

(X?  R-  nil  X!) 

B 

663. 

(SR?  (R-)  X!  (R-  - R-  A+)) 

664 

(X!  nil  R-  X!) 

B 

564. 

(SR?  (R-)  X!  (-  R-  A*)) 

684 

(X?  R-  nil  X!)

B 

686.

(SR?  (R-)  X!  (R-  -  A+  R-))

666

(X!  nil  R-  X!) 

B 

666. 

(SR?  (R-)  X!  (-  A+  R-)) 

686 

(X?  R-  nil  XI) 

B 

667. 

(SR?  (R-)  X!  (R-  - R-  A+  R-)) 

668 

(XI  nil  R-  X!) 

B 

688. 

(SR?  (R-)  X!  (-  R-  A+  R-)) 

688 

(X?  R-  nil  XI) 

B 

669. 

(SR?  0 X?  (R-  - A+)) 

609 

(N?  nil  nil  X?) 

B 

670. 

(M  nil  - X7) 

B 

671. 

(SR?  ()  X?  (R-  - R-  A+))

611 

(N?  nil  nil  X7) 

B 

572. 

(SR?  0 X?  (-  R-  A+)) 

493 

(M  nil  - X?) 

B 

673. 

(SR?  0 X?  (R-  - A+  R-)) 

613 

(N?  nil  nil  X?) 

B 

674. 

(SR?  0 X?  (-  A+  R-)) 

496 

(M  nil  - X7) 

B 

676. 

(SR?  0 X7  (R-  - R-  A+  R-)) 

616 

(N?  nil  nil  X7) 

B 

676. 

(SR?  0 X?  (-  R-  A+  R-)) 

496 

(M  nil  - X?) 

B 

677. 

(SR?  (R-  A+)  X?  0) 

429 

(S?  nil  R-  SR?) 

A 

678. 

(SR?  (R-  A+)  X?  (R-)) 

618 

(N?  nil  nil  X?) 

B 

679. 

(SR?  (R-  A+  R-)  X?  0) 

431 

(S?  nil  R-  SR?) 

A 

680. 

(SR?  (R-  A+  R-)  X?  (R-)) 

(N?  nil  nil  X?) 

B 

681. 

(SR?  (R-)  X?  (R-  - A*)) 

621 

(N7  nil  nil  X?) 

B 

682. 

(SR?  (R-)  X?  (-  A+)) 

498 

(M  nil  - X7) 

B 

683. 

(SR?  (R-)  X7  (R-  - R-  A*)) 

623 

(N?  nil  nil  X?)

B 

684. 

(SR?  (R-)  X?  (-  R-  A+)) 

497 

(M  nil  - X7) 

B 

686. 

(SR?  (R-)  X?  (R-  - A+  R-)) 

626 

(N?  nil  nil  X7) 

B 


680. 

(SR?  (R-)  X?  (-  A+  R-)) 

600 

(M  nil  - X?) 

B 

687. 

(SR?  (R-)  X?  (R-  - R-  A+  R-)) 

627 

(N?  nil  nil  X?) 

B 

688. 

(SR?  (R-)  X?  (-  R-  A+  R-)) 

499 

(M  nil  - X?) 

B 

688. 

(X  0 M?  (R-  R+)) 

690 

(M?  nil  R-  M?) 

B 

690. 

(X  0 M?  (R+)) 

696 

(N  nil  R+  M?)

B 

691. 

(X  (A-)  M?2  (R-  R*)) 

692 

(M?2  nil  R-  M?2) 

B 

682. 

(X  (A-)  M?2  (R+)) 

696 

(N?1  nil  R+  M?2)

B 

693. 

(X  (A-  R-)  M?2  (R-  R+))

684 

(M?2  nil  R-  M?2) 

B 

694. 

(X  (A-  R-)  M?2  (R+)) 

697 

(N?1  nil  R+  M?2)

B 

696. 

(X  0 N 0) 

608 

(X  nil  nil  N) 

B 

696. 

(X  (A-)  N?1  ())

262 

(N  nil  nil  X) 

A 

687. 

(X  (A-  R-)  N?1  0) 

263 

(N  nil  nil  X) 

A 

698. 

(X  0 X 0) 

Initial  State 

699. 

(X!  0 S?  (A+)) 

606 

(SR  - A+  S?)

B 

600. 

(X!  0 S?  (A+  R-)) 

606 

(SR  - A+  S?) 

B 

601. 

(X!  (R-)  S?  (A+)) 

603 

(SR  - A+  S?)

B 

602. 

(X!  (R-)  S?  (A+  R-)) 

604 

(SR  - A+  S?)

B 

603. 

(X!  (R-  -)  SR  0) 

606 

(X!  nil  R-  XI) 

A 

604. 

(X!  (R-  -)  SR  (R-)) 

606 

(XI  nil  R-  XI) 

A 

606. 

(X!  (-)  SR  0) 

(X?  R-  nil  XI) 

A 

606. 

(X!  (-)  SR  (R-)) 

656 

(X?  R-  nil  X!)

A 

607. 

(XI  (R-  - R-)  SR  0) 

609 

(XI  nil  R-  XI) 

A 

608. 

(X!  (R-  - R-)  SR  (R-)) 

610 

(XI  nil  R-  XI) 

A 

609. 

(X!  (-  R-)  SR  0) 

660 

(X?  R-  nil  X!) 

A 

610. 

(X!  (-  R-)  SR  (R-)) 

660 

(X?  R-  nil  XI) 

A 

611. 

(X!  ()  SR?  (R-  A+))

699 

(S?  nil  R-  SR?) 

B 

612. 

(XI  0 SR?  (R-  A*  R-)) 

600 

(S?  nil  R-  SR?) 

B 

613. 

(X!  (R-  - A+)  SR?  0) 

616 

(X!  nil  R-  X!) 

A 

614. 

(XI  (R-  - A+)  SR?  (R-)) 

616 

(X!  nil  R-  XI) 

A 

616. 

(XI  (-  A*)  SR?  0) 

666 

(X?  R-  nil  X!) 

A 

616. 

(X!  (-  A+)  SR?  (R-)) 

666 

(X?  R-  nil  X!) 

A 

617. 

(XI  (R-  - R-  A+)  SR?  0) 

619 

(X!  nil  R-  X!) 

A 

618. 

(XI  (R-  - R-  A+)  SR?  (R-)) 

620 

(Xl  nil  R-  X!) 

A 

619. 

(XI  (-  R-  A+)  SR?  0) 

670 

(X?  R-  nil  XI) 

A 

620. 

(XI  (-  R-  A+)  SR?  (R-)) 

670 

(X?  R-  nil  XI) 

A 

621. 

(X!  (R-  - A+  R-)  SR?  ())

623 

(XI  nil  R-  XI) 

A 

622. 

(XI  (R-  - A*  R-)  SR?  (R-)) 

624 

(XI  nil  R-  Xl) 

A 

623. 

(X!  (-  A*  R-)  SR?  0) 

674 

(X?  R-  nil  XI) 

A 

624. 

(XI  (-  A+  R-)  SR?  (R-)) 

674 

(X?  R-  nil  XI) 

A 

626. 

(XI  (R-  - R-  A+  R-)  SR?  O) 

(XI  nil  R-  XI) 

A 

626. 

(XI  (R-  - R-  A*  R-)  SR?  (R-)) 

626 

(X!  nil  R-  X!)

A 

627. 

(X!  (-  R-  A+  R-)  SR?  ())

676 

(X?  R-  nil  XI) 

A 

628. 

(X!  (-  R-  A+  R-)  SR?  (R-)) 

678 

(X?  R-  nil  XI) 

A 

629. 

(X!  (R-)  SR?  (R-  A+))

601 

(S?  nil  R-  SR?) 

B 

630. 

(XI  (R-)  SR?  (R-  A*  R-)) 

602 

(S?  nil  R-  SR?) 

B 


State                             Transition Rule

(X? () M?2 (R- R+ A-))            (M?2 nil R- M?2)
(X? () M?2 (R+ A-))               (N?1 nil R+ M?2)
(X? (R-) M?2 (R- R+ A-))          (M?2 nil R- M?2)
(X? (R-) M?2 (R+ A-))             (N?1 nil R+ M?2)
(X? () N?1 (A-))                  (S - A- N?1)
(X? (R-) N?1 (A-))                (S - A- N?1)
(X? (R- -) S ())                  (N? nil nil X?)
(X? (-) S ())                     (M nil - X?)
(X? (R- - R-) S ())               (N? nil nil X?)
(X? (- R-) S ())                  (M nil - X?)
(X? () S? (A+))                   (SR - A+ S?)
(X? () S? (A+ R-))                (SR - A+ S?)
(X? (R- - A+) S? ())              (N? nil nil X?)
(X? (- A+) S? ())                 (M nil - X?)
(X? (R- - R- A+) S? ())           (N? nil nil X?)
(X? (- R- A+) S? ())              (M nil - X?)
(X? (R- - A+ R-) S? ())           (N? nil nil X?)
(X? (- A+ R-) S? ())              (M nil - X?)
(X? (R- - R- A+ R-) S? ())        (N? nil nil X?)
(X? (- R- A+ R-) S? ())           (M nil - X?)
(X? (R-) S? (A+))                 (SR - A+ S?)
(X? (R-) S? (A+ R-))              (SR - A+ S?)
(X? (R- -) SR ())                 (N? nil nil X?)
(X? (R- -) SR (R-))               (S nil R- SR)
(X? (-) SR ())                    (M nil - X?)
(X? (-) SR (R-))                  (S nil R- SR)
(X? (R- - R-) SR ())              (N? nil nil X?)
(X? (R- - R-) SR (R-))            (S nil R- SR)
(X? (- R-) SR ())                 (M nil - X?)
(X? (- R-) SR (R-))               (S nil R- SR)

671. 

(X7  (R-  - A+  R-)  SR7  ()) 

206 

(N7  nil  nil  X7) 

A 

672. 

(X7  (R-  - A+  R-)  SR7  (R-)) 

647 

(S?  nil  R-  SR?)

B

673. 

(X7  (-  A+  R-)  SR7  0) 

17 

(M  nil  - X?)

A 

674. 

(X7  (-  A*  R-)  SR7  (R-)) 

646 

(S?  nil  R-  SR?)

B 

676. 

(X?  (R-  - R-  A+  R-)  SR?  ())

200

(N?  nil  nil  X?)

A 


676. 

(X?  (R-  - R-  A*  R-)  SR?  (R-)) 

649 

(S?  nil  R-  SR?) 

B 

677. 

(X?  (-  R-  A+  R-)  SR?  0) 

15 

(M  nil  - X?) 

A 

678. 

(X?  (-  R-  A+  R-)  SR?  (R-)) 

660 

(S?  nil  R-  SR?) 

B 

679. 

(X?  (R-)  SR?  (R-  A+)) 

661 

(S?  nil  R-  SR?) 

B 

680. 

(X?  (R-)  SR?  (R-  A+  R-)) 

662 

(S?  nil  R-  SR?) 

B 

The stable states in this enumeration are

(M () S ())  (M () SR ())  (X () X ())  (X () N ())

(S () M ())  (SR () M ())  (N () N ())  (N () X ())

Since the enumeration is exhaustive, all states reachable from these states have
been included; thus adding any of these (or any of the others listed above) to our
set of initial states would not have produced any additional output. Note that all
stable network states we would desire (and none that we wouldn't) are
represented, satisfying the consistency criterion for our protocol. Also, the closure
requirement for our protocol is satisfied, since no processor ever received a
message it did not have a state transition rule for.
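The kind of exhaustive test described above can be sketched in a few lines. The following is only an illustration under loud assumptions: the states (`idle`, `asking`, `member`), the messages (`req`, `ack`) and the rule table are a hypothetical toy protocol, not the membership protocol's rule set, and the 4-tuple layout (state, inbound queue, state, inbound queue) is assumed for the example. The sketch enumerates every reachable network state breadth-first, records any (state, message) pair lacking a rule (a closure violation), and picks out the stable states, taken here to be those with no messages in transit.

```python
from collections import deque

# Hypothetical toy rules, NOT the thesis's rule set:
# (state, received message) -> (new state, reply to send or None)
RULES = {
    ("idle", "req"): ("member", "ack"),
    ("asking", "req"): ("member", "ack"),
    ("asking", "ack"): ("member", None),
    ("member", "ack"): ("member", None),
}

# Assumed layout: (state 0, inbound queue 0, state 1, inbound queue 1)
INITIAL = ("idle", (), "idle", ())


def build(i, state_i, queue_i, queue_j, state_j):
    """Reassemble the 4-tuple with processor i's fields in the right slots."""
    if i == 0:
        return (state_i, queue_i, state_j, queue_j)
    return (state_j, queue_j, state_i, queue_i)


def successors(net):
    """Yield (next_network_state, unhandled_pair) for each possible single step."""
    states, queues = (net[0], net[2]), (net[1], net[3])
    for i in (0, 1):
        j = 1 - i
        if states[i] == "idle":
            # Spontaneous action: ask the neighbor to link up.
            yield build(i, "asking", queues[i], queues[j] + ("req",), states[j]), None
        if queues[i]:
            msg, rest = queues[i][0], queues[i][1:]
            rule = RULES.get((states[i], msg))
            if rule is None:
                yield None, (states[i], msg)  # no rule: closure is violated
            else:
                new_state, reply = rule
                out = queues[j] + ((reply,) if reply else ())
                yield build(i, new_state, rest, out, states[j]), None


def enumerate_states(initial):
    """Breadth-first search over every reachable network state."""
    seen, frontier, unhandled = {initial}, deque([initial]), set()
    while frontier:
        for nxt, missing in successors(frontier.popleft()):
            if missing is not None:
                unhandled.add(missing)
            elif nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return seen, unhandled


def stable_states(reachable):
    """A network state is stable when no messages are in transit."""
    return {net for net in reachable if not net[1] and not net[3]}
```

Calling `enumerate_states(INITIAL)` returns the reachable set together with the set of unhandled pairs; an empty second component corresponds to the closure requirement, and comparing `stable_states(...)` against the desired stable states corresponds to the consistency check.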

Only after being assured of consistency and closure does it make sense to
talk about other properties of the protocol, such as resistance to disconnection and
resistance to forming cycles. These properties, unfortunately, bring into play other
attributes of the network not directly modeled in our abstraction. As a result, it is
not obvious how to devise a test similar to the above which will check for them.
We can, however, recap our reasons for believing that our membership protocol
actually exhibits these desired properties.

Resistance to disconnection depends on two properties of our protocol. The
first, not directly expressed in the state transitions, is that only a leaf node may
attempt to remove itself from a tree. The second is the state transition sequence
that a processor must follow in order to break a link. The processor breaking the
link must have started out as master of the link; this guarantees that at most one
of the processors at the ends of a link will be attempting to break the link at one
time. When the attempt is made, the master will change to state X?. The
considerations surrounding state X? which have already been discussed ensure that
the link will only be broken if (1) the processor in state X? receives no references
over that link during the period that the break is being confirmed, or (2) the
processor in state X? rejoins the tree via some other neighbor before the break has
been confirmed (this will cause a state transition to N? for the link being broken;
this transition will force that link to be broken, preventing a cycle from forming).
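The two conditions above can be restated as a small decision procedure. This is a rough sketch only: the function name `attempt_break`, the event names (`reference`, `rejoined`, `confirm`) and the outcome strings are all hypothetical, and the real protocol expresses this behavior through per-link state transitions and messages rather than through an event list.

```python
def attempt_break(is_master, is_leaf, events):
    """Return the fate of a link when one end tries to break it.

    `events` is the (hypothetical) sequence of things that happen to the
    breaking processor while the break is being confirmed.
    """
    if not (is_leaf and is_master):
        return "not-attempted"      # only a leaf that is master of the link may try
    state = "X?"                    # master enters X? when the attempt starts
    for event in events:
        if event == "rejoined":
            state = "N?"            # rejoined via another neighbor: break is forced
        elif event == "reference" and state == "X?":
            return "kept"           # condition (1) failed: a reference arrived
        elif event == "confirm":
            return "broken"         # confirmation reached under (1) or (2)
    return "pending"                # confirmation has not arrived yet
```

For example, `attempt_break(True, True, ["reference", "confirm"])` keeps the link, while `attempt_break(True, True, ["rejoined", "reference", "confirm"])` breaks it, mirroring case (2).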

Resistance to forming cycles is embodied in the property of states N and M?
to turn away R+ messages seeking to establish new links. R+ messages are the
only way new links can be formed, and only processors in state X (not currently
part of the reference tree) will accept and act positively on them.
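A minimal sketch of this argument follows, under simplifying assumptions that are not in the text: each processor is given a single state (the protocol itself keeps state per link), `"N"` stands in for any in-tree state, and the helper names `offer_link` and `has_tree_edge_count` are invented for the illustration. Because a link forms only when the receiver is in state X, every new link adds exactly one new member, so the number of links stays one less than the number of members.

```python
def offer_link(states, links, sender, receiver):
    """Deliver an R+ from sender to receiver; return True if a link forms."""
    if states[receiver] != "X":
        return False                      # already in the tree: turn the R+ away
    states[receiver] = "N"                # receiver joins the tree
    links.add(frozenset((sender, receiver)))
    return True


def has_tree_edge_count(states, links):
    """Check the edge count a tree must have: links == members - 1."""
    members = [p for p, s in states.items() if s != "X"]
    return len(links) == len(members) - 1
```

With processor `a` already in the tree and `b`, `c` outside it, offers from `a` to `b` and from `b` to `c` succeed, while a later offer from `c` to `a` is refused, so the link that would close a cycle is never created.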




REFERENCES 


1.  Baker,  H.,  "List  Processing  In  Real  Time  on  a Serial  Computer."  A.I.  Working 
Paper  13S,  Artificial  Intelligence  Laboratory,  M.I.T.,  February  1977. 

2.  Baker,  H.,  and  Hewitt,  C.,  "The  Incremental  Garbage  Collection  of 
Processes,"  ACM  SHSART-SiOPLAN  Symposium,  Rochester,  N.Y.,  August  1 977. 

3.  Barnes,  G.H.,  et  al.,  "The  llliac  IV  Computer,"  IEEE  Transactions  C~T7,  Vol.  8, 
August  1908. 

4.  Bishop,  P.,  Computmr  Systems  with  a I/ary  Larga  Addrass  Space  and  Qarbaga 
Collactlon,  LCS  TR-178,  Laboratory  for  Computer  Science,  M.I.T.,  May  1977. 

6.  Church,  A.,  "The  Calculi  of  Lambda  Conversion,"  Annals  of  Mathematics  Stu- 
dies, Princeton  University  Press,  1941. 

0.  Curry,  H.B.,  and  Fays,  R..  Combinatory  Logic,  Amsterdam,  1 968. 

7.  Dijkstra,  E.W.,  "The  Structure  of  the  "THE"  Multiprogramming  System,"  Com- 
munications of  the  ACM,  May  1 968. 

8.  Farber,  DJ.,  "A  Ring  Network."  Datamation,  February  1976. 

9.  Farber,  DJ.,  and  Heinrich,  F.R.,  "The  Structure  of  a Distributed  Computer 
System:  The  Distributed  FHe  System,"  Proceedings  of  the  First  International 
Conference  on  Computer  Communications,  1 972. 

10.  Farber,  DJ.,  et  al.,  "The  Distributed  Computing  System,"  Proceedings  of  the 
Seventh  Annuel  IEEE  Computer  Society  International  Conference,  February 
1973. 

11.  Greif,  I.,  Semantics  of  Communicating  Parallel  Processes,  MAC  TR-164,  Pro- 
ject MAC.  M.I.T.,  September  1976. 

12.  Gule,  J.,  S.M.  thesis.  Department  of  Electrical  Engineering  and  Computer  Sci- 
ence, M.I.T.,  In  preparation. 

13.  Henderson,  DJL,  The  Binding  Models  A Semantic  Beee  for  Modular  Program- 
ming Systems,  MAC  TR-146,  Project  MAC,  M.I.T..  February  1976. 

14.  Hewitt,  C.,  "Protection  and  Synchronization  in  Actor  Systems,"  A.I.  Working 
Paper  83,  Artificial  Intelligence  Laboratory,  M.I.T.,  November  1974. 


16.  Hewitt,  C.,  "Viewing  Control  Structures  as  Patterns  of  Passing  Messages," 
A.I.  Working  Paper  92,  Artificial  Intelligence  Laboratory,  M.I.T.,  April  1070. 


16.  Hewitt,  C.,  and  Baker,  H.,  "Laws  for  Communicating  Parallel  Processes,"  A.I. 
Working  Paper  134A,  Artificial  Intelligence  Laboratory,  M.I.T.,  May  1977. 

1 7.  Knuth,  D.,  Fundamental  Algorithms,  Volume  1 of  The  Art  of  Computer  Program- 
ming, Addlson-Wesley,  Reading,  Mass.,  February  1 976,  pp.  406-420. 

18.  Lamport,  L.,  "Time,  Clocks,  and  the  Ordering  of  Events  In  a Distributed  Sys- 
tem," Massachusetts  Computer  Associates  Technical  Report  CA-7603-201 1, 
March  1976. 

19.  Learning  Research  Group,  Personal  Dynamic  Media,  Xerox  PARC  Report 
SSL76-1,  1976. 

20.  Liskov,  B.,  Snyder,  A.,  Atkinson,  R.„  and  Schaffert,  C.,  "Abstraction  Mechan- 
isms in  CLU,"  Laboratory  for  Computer  Science  Computation  Structures  Group 
Memo  144-1,  M.I.T.,  January  1977. 

21.  Metcalfe,  R.,  Packet  Communication,  MAC  TR-114,  Project  MAC,  M.I.T., 
December  1973. 

22.  Metcalfe,  R.,  and  Boggs,  D.,  Ethernet:  Distributed  Packet  Switching  for  Local 
Computer  Networks,  Xerox  PARC  Report  CSL76-7,  November  1 976. 

23.  Omstein,  S.M.,  et  al.,  "Plurlbus-A  Reliable  Multiprocessor,"  Proceedings  of 
the  National  Computer  Conference,  May  1 976. 

24.  Reed,  D.P.,  and  Kanodia,  R.K.,  Eventcounts:  A New  Model  for  Process  Syn- 
chronization, Project  MAC  Computer  Systems  Research  Division  RFC#  102, 
M.I.T.,  January  1976. 

26.  Rowe,  L.A.,  "The  Distributed  Computing  Operating  System,”  Department  of 
information  and  Computer  Science  Technical  Report  66,  University  of  Califor^ 
nia  at  Irvine,  June  1 976. 

26.  Rowe,  L.A.,  Hopwood,  M.D.,  and  Farber,  D.J.,  "Software  Methods  for  Achieving 
FalFSoft  Behavior  in  the  Distributed  Computing  System,"  Proceedings  of  the 
IEEE  Symposium  on  Computer  Software  Reliability,  1973. 

27.  Rumbaugh,  J.,  A Parallel  Asynchronous  Computer  Architecture  for  Data  Flow 
Programs,  MAC  TR-160,  Project  MAC,  M.I.T.,  May  1976. 

28.  Steams,  R.E.,  Lewis,  P.M.,  and  Rosenkrantz,  D.J.,  "Concurrency  Control  for 
Database  Systems,"  IEEE  Symposium  on  Foundations  of  Computer  Science 
CM1133-8C,  October  1976. 

29.  Steele,  G.,  "LAMBDA:  The  Ultimate  Declarative,"  A.I.  Memo  379,  Artificial 
Intelligence  Laboratory,  M.I.T.,  November  1976. 


References 


171. 


30.  Steele,  6.,  end  Sussmen,  G.,  "LAMBDA:  The  Ultimate  Imperative,"  A.I.  Memo 
363,  Artiflcial  Intelligence  Laboratory,  M.I.T.,  March  1076. 

31.  Steiger,  R.,  Actor  Machine  Architecture,  S.M.  Thesia,  Department  of  Electrical 
Engineering.  M.I.T..  May  1074. 

32.  Strachey,  C.,  and  Wadsworth,  C.P.,  "Continuations:  A Mathematical  Semantics 
for  Handling  Full  Jumps,"  Technical  Monograph  PRG>11,  Oxford  University 
Computing  Laboratory,  January  1 074. 

33. Sussman, G., and Steele, G., "SCHEME: An Interpreter for Extended Lambda
Calculus," A.I. Memo 349, Artificial Intelligence Laboratory, M.I.T., December
1975.

34. Ward, S., Functional Domains of Applicative Languages, MAC TR-130, Project
MAC, M.I.T., September 1974.

35. Ward, S., and Halstead, R., "A Syntactic Theory of Message Passing," internal
memorandum, M.I.T., September 1976.

36. Wulf, W., and Levin, R., "C.mmp - A Multi-Mini-Processor," AFIPS Conference
Proceedings, Fall 1972.

37. Wulf, W., et al., "HYDRA: The Kernel of a Multiprocessor Operating System,"
Communications of the ACM, June 1974.



Official  Distribution  List 


Defense  Documentation  Center 
Cameron  Station 

Alexandria,  Va  22314  12  copies 

Office  of  Naval  Research 
Information  Systems  Program 
Code  437 

Arlington,  Va  22217  2 copies 


New  York  Area  Office 

715  Broadway  - 5th  floor 

New  York,  N.  Y.  10003  1 copy 


Naval  Research  Laboratory 
Technical  Information  Division 
Code  2627 

Washington,  D.  C.  20375  6 copies 


Office of Naval Research
Code 102IP
Arlington, Va 22217  6 copies

Dr. A. L. Slafkosky
Scientific Advisor
Commandant of the Marine Corps
(Code RD-1)
Washington, D. C. 20380  1 copy

Office of Naval Research
Code 200
Arlington, Va 22217  1 copy

Naval Electronics Laboratory Center
Advanced Software Technology Division
Code 5200
San Diego, Ca 92152  1 copy

Office of Naval Research
Code 455
Arlington, Va 22217  1 copy

Office of Naval Research
Code 458
Arlington, Va 22217  1 copy

Mr. E. H. Gleissner
Naval Ship Research & Development Center
Computation & Mathematics Department
Bethesda, Md 20084  1 copy

Office of Naval Research
Branch Office, Boston
495 Summer Street
Boston, Ma 02210  1 copy

Captain Grace M. Hopper
NAICOM/MIS Planning Branch (OP-916D)
Office of Chief of Naval Operations
Washington, D. C. 20350  1 copy

Office of Naval Research
Branch Office, Chicago
536 South Clark Street
Chicago, Il 60605  1 copy

Mr. Kin B. Thompson
Technical Director
Information Systems Division (OP-91T)
Office of Chief of Naval Operations
Washington, D. C. 20350  1 copy

Office of Naval Research
Branch Office, Pasadena
1030 East Green Street
Pasadena, Ca 91106  1 copy


